Abstract

Subject  Matter  Experts  (SMEs)  are  commonly  used  in  cost  risk  analysis  (and  in  other 
fields  as  well)  for  values  that  either  are  not  available  in  historical  data  or  for  which  no 
appropriate analogy can be found. Problems commonly arise in two areas in particular: (1)
when  multiple  experts  give  opinions  on  a  single  effect  or  entity  and  the  inputs  are  not 
identical  in  distribution  (which  is  almost  inevitable),  and  (2)  when  a  single  expert  provides 
distributional  information  that  is  intractable  or  suspiciously  unlikely  in  its  form  (which  is 
common). 

This  paper  will  put  forward  correct  solutions  in  case  (1),  in  which  the  authors’ 
experience  shows  that  practitioners  (and  even  experts)  use  incorrect  solutions.  It  is 
important  to  note  that  the  commonly  exercised  incorrect  solution  underestimates  the 
dispersion,  and  thus  the  80th  percentile,  in  some  cases  by  a  large  margin.  The  authors 
believe  that  their  solution  is  rare  and,  further,  are  unaware  of  any  use  of  the  solution,  and 
will  recommend  tenets  to  guide  the  practitioner.  In  preparation  for  the  solutions  laid  out 
above,  the  authors  will  first  describe  the  method  of  expert-based  risk  analysis,  with  the 
erroneous  method  for  combining  SME  testimony,  and  then  show  the  correction.  An 
analytical  treatment  will  quantify  the  impacts  of  the  erroneous  approach.  The  paper  will  also 
explain  why  the  new  method  of  conflating  expert  assessments  is  to  be  preferred  to  the 
common  Delphi  technique,  which  may  fall  prey  to  both  anchoring  and  domination  by  a  vocal 
minority. 

The  paper  will  also  briefly  address  case  (2)  by  presenting  common  examples  of 
problematic  formulations  and  proposed  resolutions.  These  include  intractable  specification 
of  a  triangular  distribution,  specification  of  a  discrete  categorical  distribution  when  triangular 
was  intended,  and  specification  of  a  triangular  with  low  and  high  values  that  are  not  the  true 
extremes  as  well  as  errors  committed  by  the  risk  analyst. 

In  any  situation,  correct  treatment  of  risk  is  important.  In  the  current  era,  with  80th 
percentiles  required  for  all  weapon  systems  cost  estimates  by  the  Weapon  Systems 
Acquisition  Reform  Act  of  2009,  and  budgeting  to  the  80th  percentile  as  the  default  practice, 
the  correct  determination  of  the  distribution  is  more  important  than  ever  before. 

Overview 

Expert-based risk methodologies are a common approach to cost risk. Expert-based
risk methodologies are defined for the purposes of this paper as follows.
Notwithstanding  that  the  cost  estimate  may  be  based  on  actuals,  expert-based  risk  methods 
rely  on  elicitation  of  the  parameters  of  the  risk  distribution  from  expert  opinion.  These 
parameters  are  for  the  distribution  of  various  types  of  risk  such  as  (typically,  but  not 
exclusively)  triangles  for  cost  risk,  Bernoullis  for  technical  risk  and  occasional  normals. 
Single or multiple experts may offer estimates (expert testimony) of a particular risk via
some  form  of  parameterization. 

This paper will discuss two topics in correction of expert testimony: 1) The “best”
approach  to  converting  extrema  and  quartiles  from  expert  opinion  into  risk  distributions,  and 
2)  The  “best”  approaches  to  conflating  multiple  views  of  the  parameterization  of  a  single  risk. 

For completeness, the paper will also discuss some difficult characterizations that
the authors have encountered and the approach that they have evolved for “correcting” them.
Problems  with  inconsistent  percentiles  and  problems  with  unusual  characterizations  will  both 
be  discussed. 

This  topic  was  addressed  in  general  in  a  prior  paper  by  Coleman,  Braxton,  Druker, 
Cullis,  and  Kanick  (2009)  under  the  rubric  “Omission  of  Elements  of  Variability.”  A  paper  by 
St.  Louis,  Blackburn,  and  Coleman  (1998)  espoused  a  form  of  combination  of  expert 
testimony  that  this  paper  now  recommends  against. 

The  “Best”  Approach  to  Converting  Extrema  and  Quartiles  from 
Expert  Opinion  into  Risk  Distributions 

Correcting  Extrema  and  Quartiles  for  Truncation 

The Problem. Our estimated distributions tend to be “too tight,” as shown by
Brown (1973) and Alpert and Raiffa (1982). Without feedback, we provide extreme values
near the 20th and 80th percentiles when we are asked for the minimum and maximum. With
feedback, this improves to about the 10th and 90th percentiles. It can also be improved by
asking for more-extreme values. For example, “astonishingly-low-probability outcomes” equate
to the 0.1th and 99.9th percentiles. Without feedback, the stated 25th and 75th percentiles
(the quartiles) actually contain only 33% of the outcomes versus the expected 50%. With
feedback, this improves to 43%. Independent investigations of this over-tightness are
remarkably consistent in the degree to which it occurs, as shown by Brown (1973) and
Alpert and Raiffa (1982). Our abilities to probabilistically characterize the past, to
characterize the future, and to estimate our certainty on general-knowledge facts are all
about comparable, as noted by Lichtenstein, Fischhoff, and Phillips (1982).

Correcting  Extrema  and  Quartiles.  For  extrema,  assume  that  experts  will 
return  20th  and  80th  percentiles  when  asked  for  the  full  range.  In  other  words,  when  given 
highs and lows, assume you are getting something more like standard deviations
masquerading as extrema; it’s not quite that bad, but it’s close. Each stated end point sits
about 0.316 of the real base in from the true end point (see Appendix A). This could be
presumed to improve to 10th and 90th, but only if the
experts  can  be  assumed  to  have  gotten  specific  feedback  about  their  accuracy  at  this  task  in 
the  past.  Note  that  this  is  not  the  same  as  saying  they  are  very  well  qualified;  it  refers 
specifically  to  feedback  training.  We  believe  that  practitioners  have  mistaken  expertise  for 
being  trained  and  that  this  is  why  many  practitioners  believe  experts  provide  10th  and  90th 
percentiles.  For  quartiles,  although  we  don’t  typically  ask  for  quartiles,  we  recommend 
assuming  that  a  claimed  25-75  inter-quartile  range  is  actually  a  33-67  percentile  range.  This 
can  be  improved  to  a  28-72  range  with  specific  feedback.  The  two  distortions  above  are  not 
strictly  coherent,  meaning  that  they  yield  different  corrections.  The  full  range  case  is  a 
greater  understatement  than  the  inter-quartile  case.  The  wider  the  confidence  interval  you 
ask  for,  the  more  the  witness  will  understate  it.  When  given  expert  testimony,  therefore,  it  is 
appropriate to correct the testimony by adjusting the standard deviation or the end points
using  the  two  general  results  above,  depending  on  the  form  given. 

Errors  of  Extrema — Pictorially.  The  20th  percentile  occurs  at  a  point  that  is 
0.316  of  the  base,  so  the  understatement  of  experts  is  on  the  order  of  1/3.  Pictorially,  then, 
we  are  experiencing  a  reduction  in  distribution  on  the  order  of  the  blue  (claimed)  to  the  white 
(actual) portrayed in Figure 1. For a tutorial on computing percentiles, see Appendix A.


Figure  1.  Visualization  of  Expert  Truncation  of  Dispersion 
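The 0.316 figure can be checked directly from the CDF of a symmetric triangle, and the same relation can be inverted to widen a stated range. The following is a minimal sketch, assuming a symmetric triangle and assuming that the stated low and high are really the 20th and 80th percentiles; the function names and the sample range of 100 to 300 are illustrative and do not come from the paper.

```python
import math

def symmetric_triangle_quantile_fraction(p):
    """Fraction of the base (measured from the low end) at which the
    p-th quantile of a symmetric triangular distribution falls."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if p <= 0.5:
        return math.sqrt(p / 2.0)            # CDF is 2*x**2 on the lower half
    return 1.0 - math.sqrt((1.0 - p) / 2.0)  # mirror image on the upper half

def widen_symmetric_triangle(stated_low, stated_high, p_low=0.20):
    """Treat the stated end points as the p_low and (1 - p_low) percentiles
    of a symmetric triangle (p_low < 0.5) and return the implied
    (low, mode, high) of the full triangle."""
    f = symmetric_triangle_quantile_fraction(p_low)
    true_base = (stated_high - stated_low) / (1.0 - 2.0 * f)
    mode = 0.5 * (stated_low + stated_high)
    return mode - true_base / 2.0, mode, mode + true_base / 2.0

frac = symmetric_triangle_quantile_fraction(0.20)
low, mode, high = widen_symmetric_triangle(100.0, 300.0)
print(round(frac, 3))
print(round(low, 1), mode, round(high, 1))
```

Under these assumptions the stated range covers only about 37% of the true base, which is the "on the order of 1/3" understatement described above.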

The  “Best”  Approaches  to  Conflating  Multiple  Views  of  a 
Distribution 

Conflation 

By  definition,  conflation  refers  to  the  combining  of  different  (independent)  views  of  a 
thing  to  arrive  at  a  single  (better  and  more  complete)  view  of  it.  We  seek  to  conflate  expert 
testimony  principally  because  we  will  arrive  at  a  better  estimate  for  the  mean,  but,  what 
about  the  dispersion?  Conflation  is  the  most  difficult  problem  for  expert-based  risk 
methodologies;  this  is  not  immediately  obvious,  but  it  is  so.  Dispersion  is,  in  turn,  the  hard 
part  of  conflation.  Ad  hoc  conflations  are  often  used  for  k  experts  each  giving  estimates  for 
the  same  risk  or  WBS  element.  For  example: 

1. Use the individual expert testimonies in each run of the Monte Carlo:

a. Make k random draws from the k different distributions and average
them (as done by St. Louis, Blackburn, and Coleman (1998)).

b. Make k random draws from the k different distributions with correlation
and average them.

2. Derive the parameters of a single distribution from the parameters of
the expert testimony and then Monte Carlo:

a. Make a new distribution with i) the mean of the k expert means and ii)
the mean of the standard deviations, for normals, as demonstrated by
Brown (1973), or the means of the respective end points for triangles
[Average the Parameters].

b. Make a new distribution with the mean of the k experts and the lowest
low and the highest high as end points.

3. Sampling has been endorsed by Brown (1973). Sampling is done as
follows: for each run of the Monte Carlo, pick the answer from a
randomly selected expert who provided testimony.
We will only examine ad hoc methods 1a, 2a and sampling. The others can be
inferred  from  those.  Also,  note  that  in  backup,  we  prove  that  1b  and  2a  are  equivalent  for 
symmetric  triangles,  and  we  speculate  that  for  asymmetric  triangles  there  is  no  significant 
difference,  and  so  there  is  nothing  to  separate  these  beyond  ease  of  implementation. 

The  First  Question 

The  first  question  in  conflation  is  to  determine  what  we  believe  to  be  the  underlying 
model.  No  single  conflation  method  will  work  for  the  two  possible  scenarios  that  can 
confront  the  estimator,  namely  single  or  multiple  realities. 

“Single Reality.” There is one (typically uni-modal) distribution, which we do not
know,  but  which  experts  are  presumed  to  know  to  some  degree  of  accuracy.  Examples: 
What  is  your  estimate  for  the  GNP  of  Brazil  for  2009?  How  big  is  a  brown  bear?  What  is  the 
range  of  technical  risk  for  the  cost  of  the  engine? 

“Multiple  Realities.”  There  are  k  (typically  uni-modal)  distributions;  we  generally 
know  neither  k  nor  the  individual  distributions,  but  experts  are  presumed  to  know  at  least 
one  each  to  some  degree  of  accuracy.  Examples:  How  far  away  is  your  favorite  planet? 
(There could be up to 9 answers, depending on the inclusion of Pluto and Earth!) How big is
a panda? (There is a lesser panda and a greater panda, but we don’t happen to know that
and fail to specify.) What is the cost risk for the engine on the F-35? (There is a main and an
alternate engine; each has a range.)

This  problem  may  seem  silly,  but  it  is  not,  and  our  choice  of  conflation  methods 
depends  on  the  case  we  believe  to  apply.  We  will  recommend  approaches  for  both;  but 
first,  decide  which  case  applies.  The  amount  of  spread  in  your  expert  testimony  will  give 
you  an  idea  whether  single  or  multiple  realities  is  more  likely.  We  recommend  against 
feedback  or  “drilling  down”  until  after  testimony  is  gathered  because  witnesses  are 
notoriously vulnerable to witness leading, anchoring and all other sorts of mischief; you’ll
never know if you led the witness.

Desiderata  for  Single  and  Multiple  Realities.  Cases  dictate  different 
characteristics  for  the  conflation  technique.  Single  reality  requires  the  best  estimate  for  the 
mean,  the  best  estimate  for  the  dispersion  and  the  best  estimate  for  the  distribution. 

Multiple  realities  dictate  the  best  portrayal  of  the  multiple  choices  we  are  confronted  with. 

We  will  discuss  each  in  turn. 

We first assert the apparently preferred solution for each case and then justify
it. For single reality, average the parameters and correct for the understatement of
extrema  (using  method  1b  or  2a  from  above).  For  multiple  realities,  sample  from  the 
experts  after  correcting  each  for  understatement  of  the  extrema.  If  we  cannot  discern 
whether  we  are  in  single  or  multiple  reality,  then  we  recommend  sampling  because  this  is 
more  conservative,  meaning  it  will  have  wider  dispersion.  We  reject  the  use  of  averaging 
answers  on  each  iteration  despite  having  used  the  method  in  a  Conference  Best  Paper  by 
St.  Louis  et  al.  (1998).  To  see  why,  we  will  show  its  characteristics  and  indicate  why  it  is 
probably  unsuitable. 

Recommendation — Single Reality. The mean of the single reality is not
troublesome; almost any reasonable approach will yield the same mean. (We use the word
“reasonable”  with  trepidation.)  The  standard  deviation  presents  the  problem,  since 
individuals  are  known  to  under-report,  and  some  methods  are  vulnerable  to  distortions.  We 
recommend  averaging  parameters  of  the  expert  testimony,  as  shown  below,  because  it  is 
clear what is happening. Correct each expert’s testimony for truncation of the standard
deviation,  or  correct  the  average;  there  is  no  obvious  difference  in  the  order  of  the 
operations.  Techniques  for  correcting  the  standard  deviation  were  shown  in  the  first  part  of 
the  paper. 

Conflation: Averaging on Each Iteration. Averaging on each iteration can
have an unexpected result: Three very different sets of testimony by two experts will
produce exactly the same picture. This is not obvious at first, but it is so. The standard
deviation of k identical but scattered triangles, each with SD = s, when iteration-averaged
will be $s/\sqrt{k}$. The SD of the conflation can thus be arbitrarily small,
if k is sufficiently large. This does not comport with our desire that the SD be well modeled.
Correction for k can be achieved by spreading by a factor of $\sqrt{k}$, but this is likely to
be done wrong or omitted altogether and, at best, would require row-by-row corrections.
Correction for expert truncation can be achieved by treating the end points as if they were
20/80 points; this can be done before or after conflation.
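The shrinkage can be seen in a small Monte Carlo. This is a minimal sketch, assuming independent draws and four hypothetical experts whose triangles share a base of 200 (SD about 40.8) but are scattered in location; the function name, seed, and run count are illustrative choices, not part of the authors' model.

```python
import random
import statistics

def iteration_average_sd(triangles, runs=20000, seed=1):
    """Monte Carlo SD when, on each run, one independent draw is taken from
    every expert's triangle (low, mode, high) and the k draws are averaged."""
    rng = random.Random(seed)
    averages = []
    for _ in range(runs):
        # random.triangular takes its arguments in the order (low, high, mode)
        draws = [rng.triangular(low, high, mode) for (low, mode, high) in triangles]
        averages.append(sum(draws) / len(draws))
    return statistics.pstdev(averages)

# Hypothetical testimony: k = 4 scattered triangles, each with base 200.
experts = [(0.0, 100.0, 200.0), (100.0, 200.0, 300.0),
           (200.0, 300.0, 400.0), (300.0, 400.0, 500.0)]
sd_single = 100.0 / 6 ** 0.5      # half-base / sqrt(6) for a symmetric triangle
sd_avg = iteration_average_sd(experts)
print(round(sd_single, 1), round(sd_avg, 1))
```

With k = 4 the simulated SD should land near half the single-expert SD, regardless of how far apart the experts are, which is the insensitivity to scatter described above.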


Figure 2. Conflation by Averaging on Each Iteration

We conclude that averaging on each iteration has some good and bad
characteristics  but,  on  the  whole,  is  not  desirable.  It  produces  a  good  confidence  interval  for 
the  mean  of  the  experts,  but  this  is  not  what  we  want.  We  already  know  the  mean  of  the 
experts;  the  point  estimate  is  the  simple  average  of  the  means  of  each.  What  we  really  want 
is  the  full  range  of  the  possible  outcomes,  but  averaging  on  each  iteration  does  not  do  this; 
instead, it shrinks the answer. By analogy, this is the same problem as the confidence
interval for a CER: it bounds the line, but not the data. What we really want is the
prediction interval. It is only a candidate (and flawed at that) for clear cases of single reality.

Conflation: Averaging Parameters. Averaging parameters provides simple
results: Three very different sets of testimony by two experts will produce exactly the same
picture. The standard deviation of k identical but scattered triangles, each with standard
deviation s, when parameter-averaged will be s. The standard deviation of
the conflation will not vary with k. Correction can be achieved by spreading by a factor of
$\sqrt{k}$, but this is likely to be done wrong or omitted altogether and, at best, would
require row-by-row corrections.
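Parameter averaging for triangles is simple enough to write out. This is a minimal sketch, assuming each expert's testimony is a (low, mode, high) triple and using the standard closed-form variance of a triangular distribution; the function names and the two sample triangles are illustrative.

```python
import math

def triangle_sd(low, mode, high):
    """Standard deviation of a triangular distribution (low, mode, high)."""
    var = (low**2 + mode**2 + high**2
           - low * mode - low * high - mode * high) / 18.0
    return math.sqrt(var)

def average_parameters(triangles):
    """Conflate k triangles by averaging their lows, modes and highs."""
    k = len(triangles)
    return tuple(sum(t[i] for t in triangles) / k for i in range(3))

# Two hypothetical experts with identical shapes (base 200) but far apart.
experts = [(0.0, 100.0, 200.0), (200.0, 300.0, 400.0)]
conflated = average_parameters(experts)
print(conflated, round(triangle_sd(*conflated), 1))
```

The conflated triangle keeps the individual SD of about 40.8 however far apart the two experts sit, so the method carries no information about expert scatter.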
Figure 3. Conflation by Averaging Parameters (panel title: Two Triangular PDFs Conflated by Parameter Averaging)

We  conclude  that  averaging  parameters  has  some  good  and  bad  characteristics  but, 
on  the  whole,  is  simple  and  wieldy.  It  produces  good  estimates  of  the  mean  and  the 
standard  deviation.  It  is  insensitive  to  scatter  of  expert  testimony,  so  is  only  useable  in  clear 
cases  of  single  reality.  Correct  the  parameters  as  shown  earlier  because  each  expert  is 
likely  to  truncate.  The  order  of  the  operations  does  not  matter. 

Conflation: Sampling. “Average” the probability distributions of the k experts,
using one of two schemes, depending on the speed implications and the ease of
implementation in your model. Put all the distributions in the mix, and scale each by 1/k,
creating a (probably) multi-mode custom distribution, as recommended by Brown (1973).

This is shown pictorially in Figure 4. Alternatively, characterize each of the k
distributions  and  choose  a  first  random  number  to  select  which  expert  distribution  to  use  for 
each  run  of  the  Monte  Carlo  and  a  second  random  number  to  draw  from  that  expert’s 
distribution,  as  used  by  Flynn  et  al.  (2010).  The  two  methods  are  mathematically  identical. 
The  resulting  distribution  will  have  two  characteristics:  1)  a  better  estimate  of  the  mean  and, 
generally,  better  predictive  performance  than  other  conflation  schemes;  2)  a  wider  (actually, 
“not  narrower”)  standard  deviation  for  the  conflated  result  than  those  of  the  original 
individual  distributions.  We  don’t  know  the  degree  to  which  conflation  will  correct  dispersion, 
although  the  more  experts  the  wider  the  dispersion;  we  plan  to  attempt  a  study  of  this.  We 
will  give  a  demonstration  of  this  effect  with  representative  data. 
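The two-random-number scheme can be sketched as follows. This is a minimal illustration, assuming equal 1/k weights and triangular testimony given as (low, mode, high); the function name, seed, run count, and the two sample experts are hypothetical.

```python
import random
import statistics

def sample_conflation(triangles, runs=40000, seed=7):
    """Two-stage sampling: the first random number picks an expert with
    probability 1/k, the second draws from that expert's triangle."""
    rng = random.Random(seed)
    draws = []
    for _ in range(runs):
        low, mode, high = rng.choice(triangles)       # which expert
        draws.append(rng.triangular(low, high, mode))  # that expert's draw
    return draws

# Two hypothetical experts, each with base 200 (SD about 40.8), centers 200 apart.
experts = [(0.0, 100.0, 200.0), (200.0, 300.0, 400.0)]
draws = sample_conflation(experts)
print(round(statistics.fmean(draws), 1), round(statistics.pstdev(draws), 1))
```

For this hypothetical pair the simulated SD comes out well above either expert's own SD, illustrating that the conflated result is "not narrower" than the inputs.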

To  conflate  two  triangular  distributions,  “average”  them  as  illustrated  in  Figure  4. 
The charts in Figure 5 portray the conflation of two triangles as the respective
experts  who  estimated  them  come  into  alignment.  Each  original  individual  triangle  is 
symmetric,  has  a  base  length  of  200,  and  a  standard  deviation  of  40.8.  Conflation  is  done 
by averaging the two probability density functions (PDFs), also described as sampling.

The  two  triangles  move  closer  in  such  a  way  that  the  conflated  mean  remains  constant  at 
200  to  allow  us  to  discuss  the  CV  in  a  meaningful  way.  When  the  two  triangles  merge,  we 
get  a  triangle  that  has  the  height  and  width  of  each  individual  triangle  before  conflation.  The 
standard  deviation  of  the  conflated  distribution  will  be  shown  in  Figure  6. 
Figure 5. Conflation of Two Triangles by Sampling Maintaining a Constant Mean


As two triangular PDFs move closer, the conflated standard deviation and CV drop
until the triangles merge and achieve the same standard deviation as that of each triangle.
Since we chose to maintain the mean of the conflation at 200, the CV drops. The unsettling
conclusion is that the CV of conflated expert opinion can be uncontrollably large, depending
on how far apart their triangles are. Note that the standard deviation of the conflation of
two identical triangles, each with standard deviation $\sigma$ and separated by distance 2d,
can be shown to be $\sqrt{\sigma^2 + d^2}$, which we prove in Appendix A.
Figure 6. The Standard Deviation and Coefficient of Variation of Two Sampled Triangles as a Function of Their Separation
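The behavior plotted in Figure 6 can be tabulated from the closed form for two equally weighted, identical distributions whose means are 2d apart. This is a minimal sketch under that assumption; the function name and the chosen separations are illustrative, while the base of 200 and the mean of 200 follow the example in the text.

```python
import math

def conflated_sd_and_cv(sigma, d, mean):
    """SD and CV of an equal-weight mix of two identical distributions
    (each with SD sigma) whose means sit at mean - d and mean + d."""
    sd = math.sqrt(sigma**2 + d**2)
    return sd, sd / mean

sigma = 100.0 / math.sqrt(6.0)   # symmetric triangle with base 200: about 40.8
for d in (150.0, 100.0, 50.0, 0.0):
    sd, cv = conflated_sd_and_cv(sigma, d, mean=200.0)
    print(f"d={d:5.0f}  SD={sd:6.1f}  CV={cv:.3f}")
```

At d = 0 the result collapses to the single-triangle SD, and it grows without bound as the experts move apart.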


The  Dispersion  of  Sampled  Distributions 

Let:

$\sigma$ = SD of the underlying risk

$S_e$ = SD for the individual experts (we think it is about $\tfrac{1}{3}\sigma$)

$S_m$ = SD for the meta-distribution of the experts’ opinions

$S_c$ = SD of the conflation

Then, by examination,

if $S_e = 0$, then $S_c = S_m$

if $S_m = 0$, then $S_c = S_e$

And, further

$S_c \geq \max(S_e, S_m)$

This also implies that if $S_e$ is corrected to $\sigma$, then $S_c$ is at least $\sigma$. We have
shown, in backup, that once the experts have produced k triangles, then:

$S_c = \sqrt{S_e^2 + S_t^2}$

where $S_t$ is calculated from the sum of the squares of the differences of the k triangles from
their means. We have yet to prove that

$S_c = \sqrt{S_e^2 + S_m^2}$

but we believe it to be true.
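One way to see where a formula of this form comes from is the law of total variance. The sketch below is not the authors' backup proof. It assumes equal 1/k sampling weights, writes $\mu_i$ and $s_i$ for the mean and SD of expert $i$'s distribution, and uses $I$ for the randomly selected expert; reading $S_t$ as a root-mean-square deviation of the expert means is also an assumption.

```latex
\begin{align*}
S_c^2 = \operatorname{Var}(X)
  &= \mathbb{E}\bigl[\operatorname{Var}(X \mid I)\bigr]
   + \operatorname{Var}\bigl(\mathbb{E}[X \mid I]\bigr) \\
  &= \frac{1}{k}\sum_{i=1}^{k} s_i^2
   + \frac{1}{k}\sum_{i=1}^{k} (\mu_i - \bar{\mu})^2,
  \qquad \bar{\mu} = \frac{1}{k}\sum_{i=1}^{k} \mu_i .
\end{align*}
% If every expert has the same SD, s_i = S_e, and S_t^2 is defined as
% (1/k) * sum_i (mu_i - mu_bar)^2, this reduces to
\[
S_c = \sqrt{S_e^2 + S_t^2}.
\]
% For two identical triangles with means mu_bar - d and mu_bar + d,
% S_t = d, which recovers sqrt(sigma^2 + d^2).
```

This covers only the k triangles actually observed; it says nothing about the unproven relation involving the meta-distribution SD $S_m$.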

Thoughts  on  the  Distribution  of  Expert  Opinion.  We  will  now  speculate  on 
the  distribution  of  the  experts  themselves,  which  we  have  come  to  call  the  meta-distribution. 
Our assumptions are that: 1) Experts will not be versed in the distribution of costs, but will
be  estimating  the  distribution  based  on  the  outcomes  they  have  experienced  and  perhaps 
some  hearsay;  and  2)  Experts  are  most  likely  to  be  technical  people,  not  cost  estimators,  so 
will have experience in a handful of projects and hearsay of a somewhat larger number.
The implications of the above are that: 1) Experts will perceive a mean (and perhaps
the mode?) according to Chebyshev's inequality or a confidence interval bounded by
$\sigma/\sqrt{n}$, at best, where n is the number they have observed; and 2) Experts will
perceive a standard deviation through a variance that is $\sigma^2$ times a chi-square(n)
divided by n, at best.

The above thoughts do not yet consider the implications of truncation of the value of
$\sigma$, but this needs to be incorporated.

Combining  Corrections  for  Extrema  and  Conflation.  We  have  shown  that 
individual  distributions  can  be  corrected  for  a  consistent  pattern  of  understatement.  We 
have  shown  that  sampling  of  multiple  experts  will  improve  the  mean  and  widen  the  spread. 
But,  we  don’t  have  a  good  sense  of  how  much  the  spread  will  be  improved.  The  implication 
is  that  we  should  not  endeavor  to  both  expand  and  sample  expert  distributions.  If  we  correct 
the  individual  distributions,  then  we  will  have  the  dispersion  “about  right.”  If  we  then  sample 
them,  then  we  will  have  a  dispersion  that  exceeds  “about  right.”  So,  for  “single  reality,”  do 
one  or  the  other,  but  not  both.  Expansion  of  a  single  distribution  focuses  on  the  dispersion. 
Sampling  of  diverse  experts  focuses  on  getting  the  mean  right.  Since  we  generally 
recommend  correcting  lower  order  moments  first,  as  recommended  by  Coleman, 
Summerville,  and  Gupta  (2002),  sampling  is  the  priority.  Sampling  of  each  distribution  has 
excellent  characteristics;  it  replicates  what  the  experts  told  us  exactly.  It  has  a  problem  in 
use  for  a  single  reality  situation  because  the  standard  deviation  is  not  easily  correctible  for 
scatter  nor  is  it  useable  without  correction.  We  can  easily  correct  each  expert’s  testimony 
for  truncation,  but  we  cannot  undo  the  growth  caused  by  expert  scatter,  which  is 
theoretically  unbounded  ...  the  adjustment  would  be  a  function  of  k,  the  number  of  experts, 
and  has  yet  to  be  ascertained.  We  conclude  that,  despite  its  popularity  in  the  literature,  the 
sampling  technique  is  too  tricky  in  a  single  reality  case  and  should  not  be  used. 

Recommendation — Multiple  Realities.  The  mean  of  the  multiple  realities  case 
is  not  troublesome;  almost  any  reasonable  approach  will  yield  the  same  mean.  (Again,  that 
dangerous  word  “reasonable”!)  The  standard  deviation  does  not  present  as  much  of  a 
problem  in  a  multiple  reality  case  because  we  believe  each  expert,  like  the  six  blind  men, 
sees  a  piece  of  the  truth.  We  recommend  using  sampling.  Be  sure  to  correct  each  expert’s 
testimony  before  sampling;  you  cannot  easily  correct  it  afterwards — order  matters. 

Conclusion  for  the  Conflation  of  Single  and  Multiple  Realities 

As asserted, we have illustrated that averaging the parameters of k triangles is
equivalent to drawing from those k triangles with a single random number used to
simulate every expert’s draw and then averaging the draws. We have
demonstrated why those two equivalent methods give the simplest and clearest result for
single reality and seem the best representation of what the k experts seem to have meant.
We  have  shown  why  sampling  of  k  experts  gives  the  best  representation  of  what  the  k 
experts  seem  to  have  meant  in  the  case  of  multiple  realities.  The  issue  of  deciding  between 
single  and  multiple  realities  remains  the  most  difficult  issue.  Sometimes  it  will  be  as  simple 
as  learning  that  each  expert  has  in  mind  “a  different  engine,”  and  sometimes  it  will  be  a 
concession  to  the  wide  dispersion  and  the  recognition  that  there  “must  be  a  reason.”  We 
will  now  move  to  a  different  topic,  that  of  correcting  mischaracterization  of  distributions, 
without  which  this  paper  would  seem  incomplete. 

Correcting  the  (Mis)characterization  of  Distributions.  The  problem  is  that 
“experts”  who  may  know  a  lot  about  the  technical  issues,  and  maybe  even  the  cost  of  them, 
will  not  necessarily  be  well  versed  in  probability.  Consequently,  the  characterizations  they 
will produce will not be easily used and will sometimes be incoherent (meaning, internally
contradictory).  That  said,  expert  testimony  in  risk  analysis  should  be  accorded  the  same 
respect  that  cost  data  is  in  cost  analysis.  We  recommend  three  tenets  in  correcting 
apparently  erroneous  expert  testimony.  We  will  list  them,  and  we  will  apply  them  in  several 
actual  examples  of  errors  the  authors  have  encountered,  chosen  because  they  are  the  most 
common. 

Tenet  1.  “Do  no  harm,”  meaning  preserve  as  much  of  what  the  expert  said  as  is 
possible  in  achieving  coherence. 

Tenet  2.  Preserve  lower  order  moments  above  higher  order  moments. 

Tenet  3.  If  particular  aspects  are  more  important  than  others,  preserve  those 
aspects  (e.g.,  if  the  variability  or  upper  percentiles  are  the  focus,  accord  that  greater  priority). 

When  making  corrections,  it  is  preferable  to  make  the  corrections  with  direct 
feedback  to  the  expert,  but  this  feedback  should  be  done  under  the  same  precepts  as  the 
corrections,  meaning  follow  the  tenets  in  your  persuasions  and  probing. 

Example 1 — Implausible Percentiles. The expert told us that “The 20/50/80
are $0.0M/$0.9M/$3.6M.” The difficulty is that no triangle can fit this, and the distribution is
very skewed, so simplifying steps were taken. We assumed that the stated “50th percentile”
is really the mode. We took the 20th and 80th percentiles as “about true” and assumed that
they sit at $\pm\sigma$ from the mode. We used the rule that the half-base length of a
symmetric triangle is $\sqrt{6}\,\sigma$. We noted that this triangle is not symmetric, but we
still used the rule, applied separately on each side, as a factor that probably does a
decent job. The results are in the table in Figure 7.


Inputs ($M)          Outputs ($M)
20%-ile   0          L   -1.305
50%-ile   0.9        M    0.900
80%-ile   3.6        H    7.514

Figure 7. Table of Inputs and Corrections
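The outputs in Figure 7 can be reproduced from the stated inputs. This is a minimal sketch of the steps described above, treating the distance from the mode to each stated percentile as one standard deviation and applying the symmetric-triangle factor of the square root of 6 on each side separately; the function name is illustrative.

```python
import math

def triangle_from_20_mode_80(p20, mode, p80):
    """Approximate triangle end points, treating the distances from the mode
    to the stated 20th and 80th percentiles as one standard deviation each
    and applying the symmetric-triangle rule half-base = sqrt(6) * sigma
    separately on each side."""
    factor = math.sqrt(6.0)
    low = mode - factor * (mode - p20)
    high = mode + factor * (p80 - mode)
    return low, mode, high

low, mode, high = triangle_from_20_mode_80(0.0, 0.9, 3.6)
print(round(low, 3), round(mode, 3), round(high, 3))   # -1.305 0.9 7.514
```

The rule is an approximation for a skewed triangle, as the text notes, so the resulting end points should be read as a workable repair rather than an exact fit to the three stated percentiles.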


Note  that  the  correction  may  be  distorting  the  central  tendency,  but  this  distribution  is 
clearly  intended  to  be  skewed,  and  the  mean  is  therefore  above  the  median.  We  cannot 
actually  compute  the  mean  with  the  information  given.  We  also  knew  that  in  this  analysis, 
the  ROS  at  the  80th  percentile  was  a  particular  focus,  so  we  felt  that  preservation  of  that 
point  should  take  priority  (Tenet  3). 

Example  2 — Unlikely  Distributions.  The  expert  gave  us  three  discrete  points: 
20%  probability  of  -$2M,  40%  probability  of  $0,  20%  probability  of  +$4M.  Suspecting  that 
this was just a clumsy way to characterize a triangle, we asked if a triangle with the below
characteristics was along the lines of what the expert meant: 20th percentile = -$2M, mode
= $0M, 80th percentile = +$4M. The expert agreed readily that the precise distribution wasn’t
what he meant, and the triangle captured the sense of it.

Example  3 — Errors  of  Characterization  Induced  by  the  Risk  Analyst. 

Below  are  three  typical  errors  of  characterization  introduced  by  the  risk  analyst  after  the 
expert  has  given  his  testimony.  They  are  actual  examples,  chosen  because  they  are  the 
most  common. 

Categorical Risk Distributions. Risk models cannot always easily (or, rather,
obviously) implement a categorical random variable beyond a Bernoulli. Categorical risk
distributions are like Bernoullis but allow 2 or more values (the Bernoulli is a member of the
categorical family). Many models can handle categoricals, but most analysts don’t realize
that. For a 3-value categorical, with choices of 0, 1 and 2, many analysts implement it as
two independent Bernoullis with values of 0 or 1 and 0 or 2. This is an error because the
results are not the same: the two Bernoullis can turn out as 1 and 2 at the same time, but the
original formulation prohibits that. To fix this problem, either implement it as a categorical or
create two Bernoullis with the right characteristics.
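The difference between the two implementations can be shown by exact enumeration. This is a minimal sketch with hypothetical probabilities (P(0) = 1/2, P(1) = 3/10, P(2) = 1/5) that do not come from the paper. The repair shown, a first Bernoulli for whether any risk occurs and a second, conditional one for which value it takes, is one possible reading of "two Bernoullis with the right characteristics."

```python
from fractions import Fraction
from itertools import product

def distribution_of_sum(bernoullis):
    """Exact distribution of the sum of independent two-point variables,
    each given as (value_if_hit, probability_of_hit); a miss contributes 0."""
    dist = {}
    for outcome in product((False, True), repeat=len(bernoullis)):
        prob, total = Fraction(1), 0
        for hit, (value, p) in zip(outcome, bernoullis):
            prob *= p if hit else 1 - p
            total += value if hit else 0
        dist[total] = dist.get(total, Fraction(0)) + prob
    return dist

# Hypothetical 3-value categorical: P(0)=1/2, P(1)=3/10, P(2)=1/5.
categorical = {0: Fraction(1, 2), 1: Fraction(3, 10), 2: Fraction(1, 5)}

# The erroneous implementation: two independent Bernoullis.
independent = distribution_of_sum([(1, Fraction(3, 10)), (2, Fraction(1, 5))])

# One possible repair: a first Bernoulli decides whether any risk occurs,
# and a second, used only if the first hits, decides which value it takes.
p_any = categorical[1] + categorical[2]
p_two_given_any = categorical[2] / p_any
repaired = {0: 1 - p_any,
            1: p_any * (1 - p_two_given_any),
            2: p_any * p_two_given_any}

print(independent)
print(repaired == categorical)
```

The independent version assigns positive probability to a total of 3 (both risks at once), an outcome the categorical formulation does not allow, and it also changes the probabilities of 0, 1 and 2.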

Triangular  Risk  Distributions.  Sometimes  the  end  points  are  set  at  the  standard 
deviation  of  the  formulation;  sometimes  triangles  are  used  instead  of  normals,  even  when 
the  normal  was  proposed — out  of  aversion  to  negative  outcomes — even  though  in  practice, 
negative  outcomes  are  harmless  in  Monte  Carlo;  negative  outcomes  ought  to  be  fairly  rare 
anyway. 

Normals. Sometimes triangles are substituted incorrectly (see above). If the mean
and standard deviation are captured correctly, then there is little harm; but this is often not
done right. Sometimes the negative portion of the normal is truncated, even though this
causes a shift of the formulated mean and a reduction in the standard deviation.
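The size of the shift can be computed in closed form. The sketch below uses the standard formulas for a normal truncated from below, under the assumption that "truncation" means discarding and redrawing negative outcomes (setting negatives to zero instead would give different numbers). The function name and the example Normal(10, 10) are hypothetical.

```python
import math
from statistics import NormalDist

def truncated_at_zero(mu, sigma):
    """Mean and SD of Normal(mu, sigma) after all outcomes below 0 are discarded."""
    std = NormalDist()
    alpha = (0.0 - mu) / sigma                      # truncation point in standard units
    lam = std.pdf(alpha) / (1.0 - std.cdf(alpha))   # inverse Mills ratio
    mean = mu + sigma * lam                         # mean shifts upward
    var = sigma ** 2 * (1.0 + alpha * lam - lam ** 2)  # variance shrinks
    return mean, math.sqrt(var)

mean, sd = truncated_at_zero(mu=10.0, sigma=10.0)
print(round(mean, 2), round(sd, 2))
```

For a normal whose mean sits only one standard deviation above zero, the effect is far from negligible; for a mean several standard deviations above zero, it nearly vanishes.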

Conclusion for Correcting Mischaracterizations. We have presented tenets
by which apparent errors of characterization may be corrected and have listed the most
common risk-analyst-induced errors. We finish by reiterating that the testimony of the
experts we consult should be handled much as we should handle data. We must take care
not to ignore the symptoms of the testimony, and we must avoid such elementary errors as causing
anchoring* and "leading the witness." We should, nonetheless, carefully repair any clear
errors caused by an unfamiliarity with probability that can result in unlikely distributions.

Final  Thoughts 

The conflation of expert testimony has received some attention in the literature, but
the conclusions do not seem to have permeated the cost risk discipline. We hope that we have
provided  a  reasonably  thorough  paper  by  which  risk  analysts  might  be  guided.  We  also 
hope  that  we  have  provided  a  few  good  tenets  for  correcting  mischaracterization,  along  with 
some  illustrative  (actual)  examples. 

We  hope  to  be  able  to  take  on  the  issue  of  what  we  call  the  meta-distribution,  the 
likely distribution of individual expert testimony. Without a good model for the
meta-distribution, the full demonstration of the best answers will remain incomplete because the
meta-distribution  is  the  unseen  ground  truth  against  which  these  answers  can  be  measured. 
Until  we  can  be  satisfied  we  have  the  meta-distribution,  we  are  confined  to  showing  the 
behavior  of  various  methods  and  deciding  if  that  behavior  seems  correct. 
Appendix A. Derivations and Proofs

The Geometry of Symmetric Triangles. For a symmetric Triangle(L, M, H),
where M-L = H-M, find points l and h such that l and h are the pth and (1-p)th percentiles (see
Figure 8).

If $l-L = \tfrac{1}{4}(H-L)$ and $H-h = \tfrac{1}{4}(H-L)$, then $p = 2\left(\tfrac{1}{4}\right)^2 = \tfrac{1}{8} = 12.5\%$

If $l-L = \tfrac{1}{6}(H-L)$ and $H-h = \tfrac{1}{6}(H-L)$, then $p = 2\left(\tfrac{1}{6}\right)^2 = \tfrac{1}{18} = 5.6\%$

The pth percentile corresponds to the $\sqrt{p/2}$ base fraction, so the 20th percentile,
expressed as 1/5, occurs at point $\sqrt{1/10} = 0.316228$ base fraction.
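The relationship between percentile and base fraction is easy to check numerically. This is a minimal sketch for the lower half of a symmetric triangle (p no greater than 0.5); the function names are hypothetical.

```python
import math

def base_fraction(p):
    """Fraction of the full base, measured from L, at which the p-th percentile falls."""
    if not 0.0 <= p <= 0.5:
        raise ValueError("this sketch covers only the lower half, 0 <= p <= 0.5")
    return math.sqrt(p / 2.0)

def percentile_at(fraction):
    """Inverse relationship: cumulative probability at a given base fraction (<= 0.5)."""
    if not 0.0 <= fraction <= 0.5:
        raise ValueError("this sketch covers only fractions up to the mode, 0 <= f <= 0.5")
    return 2.0 * fraction ** 2

print(round(base_fraction(0.20), 6))   # base fraction of the 20th percentile
print(percentile_at(0.25))             # percentile at one quarter of the base
```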


Figure 8. Visual Aid to Demonstrate the Relationship of Percentiles and Base Fraction

Triangles  with  Related  Areas.  We  wish  to  know  how  to  draw  triangular 
distributions  that  are  related  to  one  another  for  our  illustrations: 


Triangles of Constant Area. For area to remain constant, in this case A = 1, as
the base increases by a factor k, the height must be multiplied by the reciprocal of that factor:

$$A = \frac{1}{2}\,b\,h = \frac{1}{2}\,(bk)\left(\frac{h}{k}\right)$$

Similar Triangles of Reduced Area. For the area of a similar triangle to be reduced by a
factor k, its dimensions must be reduced by the square root of that factor:

$$A_2 = \frac{1}{k}A_1 = \frac{1}{2k}\,b_1 h_1 = \frac{1}{2}\left(\frac{b_1}{\sqrt{k}}\right)\left(\frac{h_1}{\sqrt{k}}\right)$$

Reduction of Height to Reduce Area with Constant Base. For area to be
reduced by a factor k, the height must be reduced by that factor, if the base is to remain
constant:

$$A_2 = \frac{1}{k}A_1 = \frac{1}{2k}\,b_1 h_1 = \frac{1}{2}\,b_1\left(\frac{h_1}{k}\right)$$


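Triangular Distribution: PDF and Mean. For a Triangle(L, ML, H), denote L =
a, H = b, ML = c, so that the distribution is written T(a, c, b). Since the area of the triangle
must be 1 (100%), the height is twice the reciprocal of the base, $2/(b-a)$. We can then derive
the PDF by using similar triangles:

$$f(x) = \begin{cases} \dfrac{2(x-a)}{(b-a)(c-a)}, & a \le x \le c \\[2ex] \dfrac{2(b-x)}{(b-a)(b-c)}, & c < x \le b \end{cases}$$

Integrating $x f(x)$ over the two segments gives the mean:

$$\mu = E(X) = \frac{a+b+c}{3}$$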
Triangular Distribution: Variance

$$\sigma^2 = E\left[(X-\mu)^2\right] = E(X^2) - \mu^2$$

$$= \frac{a^2+b^2+c^2-ab-ac-bc}{18} = \frac{b^2-2ab+a^2+c^2+ab-ac-bc}{18} = \frac{(b-a)^2-(b-c)(c-a)}{18}$$

Note that the variance is thus the square of the base minus the product of the
half-bases, all divided by 18.
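The two forms of the variance can be checked against each other, and against the symmetric case. This is a minimal sketch using the standard triangular-distribution formulas, with T(a, c, b) denoting low, mode, high; the function names are hypothetical.

```python
def triangle_mean(a, c, b):
    """Mean of T(a, c, b): low a, mode c, high b."""
    return (a + b + c) / 3.0

def triangle_variance(a, c, b):
    """Variance of T(a, c, b) in the expanded form."""
    return (a * a + b * b + c * c - a * b - a * c - b * c) / 18.0

def triangle_variance_base_form(a, c, b):
    """Same variance: square of the base minus product of the half-bases, over 18."""
    return ((b - a) ** 2 - (b - c) * (c - a)) / 18.0

# An asymmetric example and a symmetric one with half-base w = 100.
print(triangle_mean(-2.0, 0.0, 4.0), triangle_variance(-2.0, 0.0, 4.0))
print(triangle_variance(100.0, 200.0, 300.0), 100.0 ** 2 / 6.0)
```

In the symmetric case both half-bases equal w, so the variance collapses to (4w² - w²)/18 = w²/6.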
Substituting a Triangular for a Normal: The $\sqrt{6}$ Factor. For a symmetric
Triangle(L, ML, H), let ML = m, L = m - w, H = m + w, where w is the half-base. Then the
mean is m, the variance is $w^2/6$ (see previous proofs), and the standard deviation is thus
$w/\sqrt{6}$. It follows that the half-base is greater than the standard deviation by a factor of
$\sqrt{6}$. So, to approximate a normal, the standard deviation of the normal to be emulated is
multiplied by $\sqrt{6}$ to produce the half-base of the triangle we wish to use in emulation.
By this means, end points are found that will produce a triangular distribution to emulate the
underlying Normal($\mu$, $\sigma$) in mean and standard deviation. This symmetrical triangular
distribution, Triangle($\mu-\sqrt{6}\sigma$, $\mu$, $\mu+\sqrt{6}\sigma$), differs from the underlying normal in all other
moments, and at all percentiles other than the median and two "cross-over" points, but the
difference is minor, as shown in Figure 9.


Figure 9. Comparison of Triangle($\mu-\sqrt{6}\sigma$, $\mu$, $\mu+\sqrt{6}\sigma$) and Normal($\mu$, $\sigma$)
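A few percentiles of the emulating triangle and the normal can be compared directly. The sketch below is illustrative: the function names and the example Normal(100, 10) are assumptions, and the triangle quantile is the symmetric-triangle percentile relationship derived earlier in this appendix.

```python
import math
from statistics import NormalDist

def emulating_triangle(mu, sigma):
    """End points and mode of the symmetric triangle matching Normal(mu, sigma) in mean and SD."""
    w = math.sqrt(6.0) * sigma      # half-base
    return mu - w, mu, mu + w

def symmetric_triangle_quantile(low, mode, high, p):
    """p-th percentile of a symmetric triangle with half-base w = mode - low."""
    w = mode - low
    if p <= 0.5:
        return low + w * math.sqrt(2.0 * p)
    return high - w * math.sqrt(2.0 * (1.0 - p))

mu, sigma = 100.0, 10.0             # hypothetical normal to be emulated
low, mode, high = emulating_triangle(mu, sigma)
normal = NormalDist(mu, sigma)
for p in (0.05, 0.20, 0.50, 0.80, 0.95):
    t = symmetric_triangle_quantile(low, mode, high, p)
    print(f"p={p:.2f}  triangle={t:8.3f}  normal={normal.inv_cdf(p):8.3f}")
```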

Variance of Hybrid Distributions: A Pythagorean Relationship. The Mean.
Suppose k distributions with PDF $P_i(x)$, mean $\mu_i$, and standard deviation $\sigma_i$ are
sampled. Then the PDF of the hybrid distribution is the "average" of the PDFs:

$$P(x) = \frac{1}{k}\sum_{i=1}^{k} P_i(x)$$

The mean of the hybrid distribution is the average of the means:

$$\mu = E(X) = \frac{1}{k}\sum_{i=1}^{k}\int x\,P_i(x)\,dx = \frac{\sum_{i=1}^{k}\mu_i}{k}$$

The variance of the hybrid distribution is the average of the variances plus the
variance of the means taken as a discrete probability distribution. See the next proof for the
derivation of the variance.


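Variance of Hybrid Distributions: A Pythagorean Relationship. The Variance.

$$\sigma^2 = E(X^2) - \mu^2 = \frac{1}{k}\sum_{i=1}^{k}\left(\sigma_i^2 + \mu_i^2\right) - \mu^2 = \frac{1}{k}\sum_{i=1}^{k}\sigma_i^2 + \frac{1}{k}\sum_{i=1}^{k}\left(\mu_i - \mu\right)^2$$

In the special case of two congruent distributions (so that $\sigma_1 = \sigma_2$) with centers at
m - d and m + d, the variance is:

$$\sigma^2 = \sigma_1^2 + d^2$$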
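The Pythagorean relationship can be sketched as a short calculation: the hybrid variance is the average of the individual variances plus the variance of the means. The function name and the example of two triangles of half-base 100 centered 50 either side of 200 are illustrative assumptions.

```python
import math

def hybrid_mean_and_variance(means, variances):
    """Mean and variance of an equally weighted hybrid (mixture) of k distributions."""
    k = len(means)
    mean = sum(means) / k
    average_variance = sum(variances) / k
    variance_of_means = sum((m - mean) ** 2 for m in means) / k
    return mean, average_variance + variance_of_means

# Two congruent symmetric triangles, half-base 100, centered at m - d and m + d.
m, d = 200.0, 50.0
sigma_sq = 100.0 ** 2 / 6.0          # variance of each triangle, w^2 / 6
mean, variance = hybrid_mean_and_variance([m - d, m + d], [sigma_sq, sigma_sq])
print(mean, math.sqrt(variance), math.sqrt(sigma_sq + d ** 2))
```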


Equivalence of Averaging Distributions and Averaging Parameters for
Symmetric Triangles. In the case of symmetric triangles, averaging the individual
triangles (with perfect rank correlation) can be shown to be equivalent to averaging the
parameters. We will prove it in the case of two triangles, but the proof can easily be
extended to more.

As previously shown, the pth percentile (p < 0.5) for a symmetric triangle is at the $\sqrt{2p}$
half-base fraction. Writing $m_i$ for the mode and $w_i$ for the half-base of triangle i, the pth
percentiles of the two triangles and their average are:

$$x_1 = m_1 - w_1 + w_1\sqrt{2p}, \qquad x_2 = m_2 - w_2 + w_2\sqrt{2p}$$

$$\frac{x_1 + x_2}{2} = \frac{m_1+m_2}{2} - \frac{w_1+w_2}{2} + \frac{w_1+w_2}{2}\sqrt{2p}$$

But this is clearly just the pth percentile of the triangle with the averaged parameters. A
similar proof works for p > 0.5. Since all percentiles are equal, the resulting distributions are
identical. Monte Carlo simulation could be used to explore the difference between the two
methods for asymmetric triangles, but it is expected to be small.
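The equivalence for symmetric triangles can also be checked numerically. In this sketch each expert is a hypothetical (mode, half-base) pair, and perfect rank correlation is modeled by drawing every expert at the same percentile p.

```python
import math

def sym_quantile(m, w, p):
    """p-th percentile of a symmetric triangle with mode m and half-base w."""
    if p <= 0.5:
        return m - w + w * math.sqrt(2.0 * p)
    return m + w - w * math.sqrt(2.0 * (1.0 - p))

def averaged_draw(experts, p):
    """Average of the experts' draws when all are taken at the same percentile p."""
    return sum(sym_quantile(m, w, p) for m, w in experts) / len(experts)

def averaged_parameters(experts):
    """Mode and half-base of the triangle obtained by averaging the parameters."""
    k = len(experts)
    return sum(m for m, _ in experts) / k, sum(w for _, w in experts) / k

experts = [(100.0, 30.0), (140.0, 50.0)]    # hypothetical testimony
m_bar, w_bar = averaged_parameters(experts)
max_gap = max(
    abs(averaged_draw(experts, p / 100.0) - sym_quantile(m_bar, w_bar, p / 100.0))
    for p in range(1, 100)
)
print(max_gap)   # zero up to floating-point rounding
```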
► Abstract


Subject  Matter  Experts  (SMEs)  are  commonly  used  in  cost  risk  analysis  (and  in  other  fields  as  well)  for 
values  that  either  are  not  available  in  historical  data  or  for  which  no  appropriate  analogy  can  be  found. 
Problems  commonly  arise  in  two  areas  in  particular:  (1)  when  multiple  experts  give  opinions  on  a 
single  effect  or  entity  and  the  inputs  are  not  identical  in  distribution  (which  is  almost  inevitable);  and 
(2)  when  a  single  expert  provides  distributional  information  that  is  intractable  or  suspiciously  unlikely 
in  its  form  (which  is  common). 

This paper will put forward correct solutions in case (1), where the authors' experience shows that
practitioners (and even experts) use incorrect solutions. It is important to note that the commonly
exercised incorrect solution underestimates the dispersion, and thus the 80th percentile, in some cases
by a large margin. The authors believe that their solution is rare and, further, are unaware of any use
of the solution, and will recommend tenets to guide the practitioner. In preparation for the solutions
laid  out  above,  the  authors  will  first  describe  the  method  of  expert-based  risk  analysis,  with  the 
erroneous  method  for  combining  SME  testimony,  and  then  show  the  correction.  An  analytical 
treatment  will  quantify  the  impacts  of  the  erroneous  approach.  The  paper  will  also  explain  why  the 
new  method  of  conflating  expert  assessments  is  to  be  preferred  to  the  common  Delphi  technique, 
which  may  fall  prey  to  both  anchoring  and  domination  by  a  vocal  minority. 

The  paper  will  also  briefly  address  case  (2)  by  presenting  common  examples  of  problematic 
formulations  and  proposed  resolutions.  These  include  intractable  specification  of  a  triangular 
distribution;  specification  of  a  discrete  categorical  distribution  when  triangular  was  intended;  and 
specification  of  a  triangular  with  low  and  high  values  that  are  not  the  true  extremes. 

In  any  situation,  correct  treatment  of  risk  is  important.  In  the  current  era,  with  80th  percentiles 
required  for  all  weapon  systems  cost  estimates  by  the  Weapon  Systems  Acquisition  Reform  Act  of 
2009,  and  budgeting  to  the  80th  percentile  as  the  default  practice,  the  correct  determination  of  the 
distribution  is  more  important  than  ever  before. 
► Problem Statement


Expert-based  risk  methodologies  are  a  common  approach  to  cost  risk 
Expert-based  risk  methodologies  are  defined  for  the  purposes  of  this  paper  as  follows: 

Notwithstanding  that  the  cost  estimate  may  be  based  on  actuals,  expert-based  risk  methods  rely  on  elicitation 
of  the  parameters  of  the  risk  distribution  from  expert  opinion 
Typically  triangles  for  cost  risk 
Typically  Bernoullis  for  technical  risk 
May  include  normals 

Single  or  multiple  experts  may  offer  estimates  of  a  particular  risk  via  some  form  of  parameterization 

This  paper  will  discuss  two  topics 

The  "best"  approach  to  converting  extrema  and  quartiles  from  expert  opinion  into  risk  distributions 
The  "best"  approaches  to  conflating  multiple  views  of  the  parameterization  of  a  single  risk 

For completeness, the paper will also discuss some difficult characterizations that the authors
have encountered and the approach that they have evolved for "correcting" them

Inconsistent  percentiles 
Unusual  characterizations 


This  topic  was  addressed  in  general  in  a  prior  paper1  under  the  rubric  "Omission  Of 
Elements  Of  Variability" 


A  confession:  A  prior  paper2  espoused  a  form  of  combination  of  expert  testimony  that  this 
paper  now  recommends  against 


1. Are We at the 50th Percentile Now and Can We Estimate to the 80th? Richard L. Coleman (TASC), Peter J. Braxton (TASC), Eric R. Druker (BAH), Bethia L. Cullis (TASC), Christina M. Kanick (TASC)

2. Risk Analysis of a Major Government Information Production System, Expert-Opinion-Based Software Cost Risk Analysis Methodology, N. L. St. Louis, F. K. Blackburn, R. L. Coleman, SCEA/ISPA International Conference 1998, ADoDCAS 1998, Journal of Parametrics, June 1998, Awarded DoDCAS Outstanding Contributed Paper and Overall Best Paper Award SCEA/ISPA
The "Best" Approach To Converting Extrema And Quartiles From Expert Opinion Into Risk Distributions
► Correcting Extrema and Quartiles for Truncation


► Our estimated distributions tend to be "too tight"3,4

- Extrema

Without feedback, we provide values near the 20th %-ile and 80th
%-ile when we are asked for Min and Max

This can be improved, with feedback, to the 10th and 90th %-iles
This can be improved by asking for more-extreme values:

"Astonishingly-low-probability outcomes" equate to the 0.1th
%-ile and 99.9th %-ile

- Quartiles

Without feedback, we give 25th and 75th percentiles (quartiles) that actually
contain only 33% of the outcomes vs. the expected 50%

This can be improved with feedback to 43% vs. the expected 50%

Independent investigations of this over-tightness are
remarkably consistent in the degree to which it occurs3,4

- Our abilities to probabilistically characterize the past or the future, or
to estimate our certainty on general-knowledge facts, are all
about comparable5


3. Judgment under uncertainty: Heuristics and biases, Edited by Daniel Kahneman, Paul Slovic and Amos Tversky, Cambridge University Press, 1982, Chapter 21, A progress report on the training of probability assessors, Alpert & Raiffa

4. An experiment in Probabilistic Forecasting, Thomas A. Brown, R-944-ARPA, July 1973

5. Judgment under uncertainty: Heuristics and biases, Edited by Daniel Kahneman, Paul Slovic and Amos Tversky, Cambridge University Press, 1982, Chapter 22, Calibration of Probabilities: the state of the art to 1980, Lichtenstein, Fischhoff & Phillips
► Correcting Extrema and Quartiles - Two Views


Assume that experts will return 20th and 80th percentiles when asked for the full range

In other words, when given highs and lows, assume you are getting something more like
standard deviations masquerading as extrema. It's not quite that bad, but it's close: it's
about .316 of the real base*

This  could  be  presumed  to  improve  to  10th  and  90th  but  only  if  the  experts  can  be 
assumed  to  have  gotten  specific  feedback  about  their  accuracy  at  this  task  in  the  past 

Note  that  this  is  not  the  same  as  saying  they  are  very  well  qualified,  it  refers  specifically  to  feedback 
training 

We  believe  that  practitioners  have  mistaken  expertise  for  being  trained  and  that  this  is  why  many 
practitioners  believe  experts  provide  10th  and  90th  percentiles 

Although  we  don't  typically  ask  for  quartiles,  we  recommend  assuming  that  a 
claimed  25-75  inter-quartile  range  is  actually  a  33-67  percentile  range 
This  can  be  improved  to  a  28-72  range  with  specific  feedback 

The  two  distortions  above  are  not  strictly  coherent,  meaning  that  they  yield 
different  corrections 

The  full  range  case  is  a  greater  understatement  than  the  interquartile  case 

The  wider  the  confidence  interval  you  ask  for ,  the  more  the  witness  will  understate  it 


When  given  expert  testimony,  therefore,  it  is  appropriate  to  correct  the 
testimony  by  adjusting  the  standard  deviation  or  the  end  points  using  the  two 
general  results  above,  depending  on  the  form  given* 


*  See  the  backup 
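As an illustration of the end-point correction, the sketch below widens a symmetric triangle on the assumption that the expert's stated low and high are really the pth and (1-p)th percentiles (p = 0.20 without feedback, p = 0.10 with feedback). It uses the symmetric-triangle percentile relationship from the backup; the function names and the example testimony (80, 100, 120) are hypothetical, and asymmetric triangles are not handled.

```python
import math

def expand_symmetric_triangle(low, mode, high, p=0.20):
    """Treat (low, high) as the p-th and (1-p)-th percentiles and return the wider triangle."""
    claimed_half_range = (high - low) / 2.0
    # For a symmetric triangle with half-base w, the p-th percentile lies
    # w * (1 - sqrt(2p)) below the mode, so solve for w.
    w = claimed_half_range / (1.0 - math.sqrt(2.0 * p))
    return mode - w, mode, mode + w

def lower_percentile_point(low, mode, p):
    """Location of the p-th percentile (p <= 0.5) of a symmetric triangle."""
    return low + (mode - low) * math.sqrt(2.0 * p)

true_low, true_mode, true_high = expand_symmetric_triangle(80.0, 100.0, 120.0, p=0.20)
print(round(true_low, 2), true_mode, round(true_high, 2))
print(round(lower_percentile_point(true_low, true_mode, 0.20), 6))  # recovers the claimed low
```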
► Errors of Extrema - Pictorially


We note that experts appear to be providing
approximately the 20th and 80th percentiles

► We know* that the 20th percentile occurs at a point that is
$\sqrt{1/10} = 0.316$ of the base

-  The  understatement  of  variance  by  experts  is  on  the  order  of  2.5 

►  Pictorially,  then,  we  are  experiencing  a  reduction  in 
distribution  on  the  order  of  the  blue  (claimed)  to  the 
white  (actual)  portrayed  below 


Each  triangle  has  area  A 


*  For  the  geometry  of  triangles  with  regard  to  percentiles  and  area,  see  the  backup 

The "Best" Approaches To Conflating Multiple Views Of A Distribution
► Conflation of Expert Information


Conflation  refers  to  the  combining  of  different  (independent)  views  of  a  thing  to  arrive  at  a  single  (better,  and 
more  complete)  view  of  it 

We  seek  to  conflate  expert  testimony  principally  because  we  will  arrive  at  a  better  estimate  for  the  mean 
But,  what  about  the  dispersion? 

Conflation  is  the  most  difficult  problem  for  expert-based  risk  methodologies 
This  is  not  immediately  obvious,  but  it  is  so 
Dispersion  is  in  turn  the  hard  part  of  the  conflation 

Ad hoc conflations are often used for k experts each giving estimates for the same risk or WBS element, e.g.:

1. Use the individual expert testimonies in each run of the Monte Carlo:

   a. Make k random draws from the k different distributions and average them6

   b. Make k random draws from the k different distributions with correlation and average them

2. Derive the parameters of a single distribution from the parameters of the expert testimony and then Monte Carlo

   a. Make a new distribution with i) the mean of the k expert means and ii) the mean of the standard deviations, for normals7, or
      the means of the respective end points for triangles [Average the Parameters]

   b. Make a new distribution with the average mode of the k experts and the lowest low and the highest high as end points

   c. Make a new distribution with the average mean of the k experts and the lowest low and the highest high as end points

3. Sampling has been endorsed in the literature7

   For each run of the Monte Carlo, pick the answer from a randomly selected expert who provided testimony

We will examine each of these methods


In backup we prove that 1b and 2a are equivalent for symmetric triangles and we speculate that for asymmetric
triangles there is no significant difference, and so there is nothing to separate these beyond ease of implementation


►  The  First  Question 


No single conflation method will work for the two possible scenarios that can
confront the estimator

"Single Reality": There is one (typically uni-modal) distribution, which we do not
know, but which experts are presumed to know to some degree of accuracy
Example:  What  is  your  estimate  for  the  GNP  of  Brazil  for  2009? 

Example:  How  big  is  a  brown  bear? 

Example:  What  is  the  range  of  technical  risk  for  the  cost  of  the  engine? 

"Multiple  Realities":  There  are  k  (typically  uni-modal)  distributions,  we  generally 
know  neither  k  nor  the  individual  distributions,  but  experts  are  presumed  to  know  at 
least  one  each  to  some  degree  of  accuracy 

Example:  How  far  away  is  your  favorite  planet?  [there  could  be  up  to  9  answers  depending  on 
the  inclusion  of  Pluto  and  Earth!] 

Example:  How  big  is  a  panda?  [there  is  a  lesser  panda  and  a  greater  panda,  but  we  don't 
happen  to  know  that  and  fail  to  specify] 

Example:  What  is  the  cost  risk  for  the  engine  on  the  F-35?  [There  is  a  main  and  an  alternate 
engine,  each  has  a  range] 

This  problem  may  seem  silly,  but  it  is  not,  and  our  choice  of  conflation 
methods  depends  on  the  case  we  believe  to  apply 


We  will  recommend  approaches  for  both,  but  first,  decide  which  case  applies 


The  amount  of  spread  in  your  expert  testimony  will  give  you  an  idea  whether 
single  or  multiple  reality  is  more  likely 

We  recommend  against  feedback  or  "drilling  down"  until  after  testimony  is  gathered 
because  witnesses  are  notoriously  vulnerable  to  witness  leading,  anchoring  and  all 
other  sorts  of  mischief  ...  you'll  never  know 
► Desiderata for Single and Multiple Reality Cases


►  Each  case  dictates  different  characteristics  for  the 
conflation  technique 

►  Single  reality: 

Best  estimate  for  the  mean 
Best  estimate  for  the  dispersion 
Best  estimate  for  the  distribution 

►  Multiple  Realities 

Best  portrayal  of  the  multiple  choices  we  are  confronted  with 

We  will  discuss  each  in  turn 


► The Preferred Methods


We will describe the apparent preferred solution for each
case after asserting them below

►  Single  Reality: 

Average  the  parameters  and  correct  for  the  understatement  of 
extrema  (using  method  lb  or  2a  from  an  earlier  slide) 

►  Multiple  Realities 

Sample  from  the  experts  after  correcting  each  for  understatement 
of  the  extrema 

►  If  we  cannot  discern  whether  we  are  in  Single  Reality  or 
Multiple  Realities,  we  recommend  sampling 

Because  this  is  more  conservative,  meaning  it  will  have  wider 
dispersion 


We reject the use of averaging answers on each iteration
despite having used the method in an Overall Best Paper8 in
1998. To see why, we will show its characteristics and
indicate why it is probably unsuitable.


8. Risk Analysis of a Major Government Information Production System, Expert-Opinion-Based Software Cost Risk Analysis Methodology, N. L. St. Louis, F. K. Blackburn, R. L. Coleman, SCEA/ISPA
International Conference 1998, ADoDCAS 1998, Journal of Parametrics, June 1998, Awarded DoDCAS Outstanding Contributed Paper and Overall Best Paper Award SCEA/ISPA


Richard.  lewis.coleman@qmail.cc 


,  703-615-4482,  13 


©2010TASC,  Inc. 


TASC 


►  Recommendation  -  Single  Reality 


► The mean of the single reality is not troublesome; almost
any reasonable approach will yield the same mean

-  We  use  the  word  "reasonable"  with  trepidation 

►  The  standard  deviation  presents  the  problem,  since 
individuals  are  known  to  under-report,  and  some 
methods  are  vulnerable  to  distortions 

►  We  recommend  averaging  parameters  of  the  expert 
testimony  because  it  is  clear  what  is  happening 

► Correct each expert's testimony for truncation of the
standard deviation, or correct the average; there is no
obvious difference in the order of the operations

-  Techniques  for  correcting  the  standard  deviation  were  shown  in 
the  first  part  of  the  paper 
Conflation: Averaging on Each Iteration (1a)


Averaging on each iteration can have an unexpected result: Three very different sets of testimony by
two experts will produce exactly the same picture
This is not obvious at first, but it is so

k identical but scattered triangles, each with SD = s, when iteration-averaged will
produce a standard deviation of $s/\sqrt{k}$

The SD of the conflation can thus be arbitrarily small, if k is sufficiently large
This does not comport with our desire that the SD be well modeled

Correction for k can be achieved by a spreading with $\sqrt{k}$, but this is likely to be done wrong or omitted altogether, and at
best would require row-by-row corrections

Correction for expert truncation can be achieved by treating the end points as if they were 20/80 points; this can be
done before or after conflation
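The shrinkage from averaging on each iteration can be seen in a small Monte Carlo. This is a minimal sketch assuming independent draws from k symmetric triangles of equal half-base; the centers, half-base, seed, and function name are illustrative assumptions.

```python
import math
import random
import statistics

def iteration_averaged_sd(centers, half_base, runs, rng):
    """SD of the per-iteration average of one independent draw from each expert's triangle."""
    averages = []
    for _ in range(runs):
        draws = [rng.triangular(c - half_base, c + half_base, c) for c in centers]
        averages.append(sum(draws) / len(draws))
    return statistics.pstdev(averages)

rng = random.Random(7)
half_base = 100.0
s = half_base / math.sqrt(6.0)       # SD of each individual triangle
for centers in ([200.0], [150.0, 250.0], [100.0, 200.0, 250.0, 300.0]):
    k = len(centers)
    simulated = iteration_averaged_sd(centers, half_base, 20000, rng)
    print(k, round(simulated, 1), round(s / math.sqrt(k), 1))
```

However far apart the experts' centers are, the simulated SD tracks s divided by the square root of k, which is why the scatter of the testimony leaves no trace in the result.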


[Figure: Averaging. Three panels, each titled "Two Triangular PDFs"]
► Conflation: Averaging on Each Iteration (1a)


We conclude that averaging on each iteration has
some good and bad characteristics, but on the whole
is not desirable

►  It  produces  a  good  confidence  interval  for  the  mean 
of  the  experts,  but  this  is  not  what  we  want 

-  We  already  know  the  mean  of  the  experts,  the  point  estimate 
is  the  simple  average  of  the  means  of  each 

-  What  we  really  want  is  the  full  range  of  the  possible  outcomes, 
but  averaging  on  each  iteration  does  not  do  this,  instead  it 
shrinks  the  answer 

-  By  analogy,  this  is  the  same  problem  as  the  confidence 
interval  for  a  CER  ...  it  bounds  the  line,  but  not  the  data  ... 
what  we  really  want  is  the  prediction  interval 

-  It  is  only  a  candidate  (and  flawed  at  that)  for  clear  cases  of 
single  reality 
Conflation: Averaging Parameters (2a)


Averaging parameters provides simple results: Three very different sets of
testimony by two experts produce exactly the same picture

k identical but scattered triangles, each with a standard
deviation of s, when parameter-averaged will produce a standard deviation of s
The SD of the conflation will not vary with k

Correction can be achieved by a spreading with $\sqrt{k}$, but this is likely to be done wrong
or omitted altogether, and at best would require row-by-row corrections


►  Conflation:  Averaging  Parameters  (2a) 


►  We  conclude  that  averaging  parameters  has  some 
good  and  bad  characteristics,  but  on  the  whole  is 
simple  and  wieldy 

It  produces  good  estimates  of  the  mean  and  the  standard 
deviation 

-  It  is  insensitive  to  scatter  of  expert  testimony,  so  is  only 
useable  in  clear  cases  of  single  reality 

Correct  the  parameters  as  shown  earlier  because  each  expert 
is  likely  to  truncate 

The  order  of  the  operations  does  not  matter 


► Conflation: Sampling (3)


"Average"  the  probability  distributions  of  the  k  experts,  using 
one  of  two  schemes,  depending  on  the  speed  implications  and 
the  ease  of  implementation  in  your  model: 

Put  all  the  distributions  in  the  mix,  and  scale  each  by  1/k,  creating  a 
(probably)  multi-mode  custom  distribution9 
We  will  see  this  pictorially  on  the  next  slide 

Characterize  each  of  the  k  distributions  and  choose  a  first  random  number 
to  select  which  expert  distribution  to  use  for  each  run  of  the  Monte  Carlo 
and  a  second  random  number  to  draw  from  that  expert's  distribution10 

The  two  above  methods  are  mathematically  identical 
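Both schemes can be sketched in a few lines. The code below is illustrative: it assumes two hypothetical experts with non-degenerate triangles (low < mode < high), builds the mixed PDF of the first scheme, draws by the two-random-number approach of the second scheme, and compares the simulated SD with the value implied by averaging the variances and adding the variance of the means.

```python
import math
import random
import statistics

def triangle_pdf(x, low, mode, high):
    """PDF of a triangle with low < mode < high."""
    if x < low or x > high:
        return 0.0
    if x <= mode:
        return 2.0 * (x - low) / ((high - low) * (mode - low))
    return 2.0 * (high - x) / ((high - low) * (high - mode))

def mixture_pdf(x, experts):
    """Scheme 1: every expert's PDF in the mix, each scaled by 1/k."""
    return sum(triangle_pdf(x, *e) for e in experts) / len(experts)

def sample_conflation(rng, experts):
    """Scheme 2: first pick an expert at random, then draw from that expert's triangle."""
    low, mode, high = rng.choice(experts)
    return rng.triangular(low, high, mode)

experts = [(50.0, 150.0, 250.0), (150.0, 250.0, 350.0)]   # hypothetical (low, mode, high)
rng = random.Random(11)
draws = [sample_conflation(rng, experts) for _ in range(40000)]
simulated_sd = statistics.pstdev(draws)
theoretical_sd = math.sqrt(100.0 ** 2 / 6.0 + 50.0 ** 2)  # avg variance + variance of means
area = sum(mixture_pdf(i * 0.5, experts) for i in range(0, 801)) * 0.5
print(round(simulated_sd, 1), round(theoretical_sd, 1), round(area, 4))
```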

►  The  resulting  distribution  will  have  two  characteristics: 

A  better  estimate  of  the  mean  and  generally  better  predictive 
performance  than  other  conflation  schemes9 

A  wider  (actually,  "not  narrower")  standard  deviation  for  the  conflated 
result  than  those  of  the  original  individual  distributions 

We  don't  know  the  degree  to  which  sampling  will  correct  dispersion, 
although  the  more  experts  the  wider  the  dispersion 

We  plan  to  attempt  a  study  of  this 

We  will  give  a  demonstration  of  this  effect  with  representative  data 


9.  An  experiment  in  Probabilistic  Forecasting ,  Thomas  A.  Brown,  R-944-ARPA,  July  1973 

10.  Determining  the  Cost  of  the  Certification  and  Accreditation  Process  using  Expert  Opinion  and  Monte  Carlo  Simulation ,  A. 
J.  Flynn,  B.  J.  Nethery,  K.  Thomas,  A.  E.  Gerstner,  B.  Dickey,  C.  M.  Kanick,  and  P.  J.  Braxton,  SCEA  2010 
► Conflation: Sampling (3)


To  conflate  two  triangular  distributions,  "average"  them 


The  first  distribution 


The  second  distribution 


The  conflated  (averaged)  distribution 
► Sampling of Two Triangles - PDF


These charts portray the conflation of two triangles as the respective experts who
estimated them come into alignment

Each original individual triangle is symmetric, has a base length of 200, and a standard deviation of 40.8
Conflation is done by averaging the two PDFs (also described as sampling)

The two triangles move closer in such a way that the conflated mean remains constant

We maintained the same conflated mean of 200

We kept the conflated mean constant to allow us to discuss the CV in a meaningful way

When the two triangles merge, we get a triangle that has the height and width of each individual triangle
before conflation


The  standard  deviation  of  the  conflated  distribution  will  be  shown  on  the  next  graph 
► Conflation of Two Triangles - CV and SD


As  two  triangular  PDFs  move  closer,  the  conflated  standard  deviation  and  CV  drop 
until  the  triangles  merge  and  achieve  the  same  standard  deviation  as  that  of  each 
triangle 

Since  we  chose  to  maintain  the  mean  of  the  conflation  at  200,  the  CV  drops 

The unsettling conclusion is that the CV of conflated expert opinion can be
uncontrollably large, depending on how far apart their triangles are

The standard deviation of two identical triangles, each with standard deviation $\sigma$, whose centers
are separated by distance 2d can be shown* to be $\sqrt{\sigma^2+d^2}$

*We aren't saying it's easy ... this phrase usually means the Professor is too lazy to show you or too kind to bore you, and the former is by far the more likely! We're the latter; the proof is in backup
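The behavior of the SD and CV as the two triangles converge can be tabulated from the formula. The sketch assumes the slide's setup of symmetric triangles with base 200 (half-base 100) and a conflated mean held at 200; the function name and the separations chosen are illustrative.

```python
import math

def conflated_sd_and_cv(d, half_base=100.0, mean=200.0):
    """SD and CV of two sampled identical symmetric triangles centered at mean - d and mean + d."""
    sigma = half_base / math.sqrt(6.0)        # SD of each triangle, about 40.8
    sd = math.sqrt(sigma ** 2 + d ** 2)
    return sd, sd / mean

for d in (100.0, 75.0, 50.0, 25.0, 0.0):
    sd, cv = conflated_sd_and_cv(d)
    print(f"d={d:6.1f}  SD={sd:7.2f}  CV={cv:6.3f}")
```

At d = 0 the triangles coincide and the SD falls back to that of a single triangle; as d grows, the SD approaches d itself.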


[Figures: Standard Deviation vs. % Overlap; CV vs. % Overlap; Standard Deviation vs. Separation Distance (Monte Carlo and Theoretical series)]


►  The  Dispersion  of  Sampled  Distributions 


Let:

$\sigma$ = SD of the underlying risk

$S_e$ = SD for the individual experts (we think it is about ½ $\sigma$)

$S_m$ = SD for the meta-distribution of the experts' opinions

$S_c$ = SD of the conflation

Then,

if $S_e = 0$, then $S_c = S_m$
if $S_m = 0$, then $S_c = S_e$

And, further,

$S_c \ge \max(S_e, S_m)$

This also implies that if $S_e$ is corrected to $\sigma$, $S_c$ exceeds $\sigma$

We have shown, in backup, that once the experts have produced k triangles, then:

$S_c = \sqrt{S_e^2 + S_t^2}$

where $S_e^2$ is taken as the average of the k triangle variances and $S_t^2$ is the mean squared deviation of the k triangle means from their overall mean (the variance of the means taken as a discrete distribution). We have yet to prove that:

$S_c = \sqrt{S_e^2 + S_m^2}$

But we believe it to be true
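As a rough check of the k-triangle result, the sketch below compares the Pythagorean formula with a Monte Carlo sampling of k triangles. The four expert triangles are hypothetical, and the variable names (`se2`, `st2`) are ours: `se2` is the average of the triangle variances and `st2` is the variance of the triangle means about their overall mean.

```python
import math
import random

def tri_mean(a, c, b):
    """Mean of Triangle(low=a, mode=c, high=b)."""
    return (a + b + c) / 3.0

def tri_var(a, c, b):
    """Variance of Triangle(low=a, mode=c, high=b)."""
    return ((b - a) ** 2 - (b - c) * (c - a)) / 18.0

# Hypothetical expert triangles as (low, mode, high)
experts = [(80, 100, 150), (120, 160, 220), (150, 200, 300), (90, 180, 260)]
k = len(experts)

means = [tri_mean(*t) for t in experts]
grand_mean = sum(means) / k
se2 = sum(tri_var(*t) for t in experts) / k             # average expert variance
st2 = sum((mu - grand_mean) ** 2 for mu in means) / k   # variance of the means
sc_formula = math.sqrt(se2 + st2)

# Sampling: pick an expert at random, then draw from that expert's triangle
rng = random.Random(7)
n = 200_000
draws = []
for _ in range(n):
    a, c, b = experts[rng.randrange(k)]
    draws.append(rng.triangular(a, b, c))   # arguments are (low, high, mode)
mc_mean = sum(draws) / n
sc_mc = math.sqrt(sum((x - mc_mean) ** 2 for x in draws) / n)
print(f"formula={sc_formula:.2f} monte_carlo={sc_mc:.2f}")
```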
► Thoughts on the Distribution of Expert Opinion


in  the  distribution  of  costs,  but  will 
be  estimating  the  distribution  based  on  the  outcomes  they 
have  experienced  and  perhaps  some  hearsay 

2. Experts are most likely to be technical people, not cost estimators, so will have experience in a handful of projects and hearsay of a somewhat larger number

►  Implications 

1. Experts will perceive a mean (and perhaps the mode?) according to Chebyshev's inequality or a confidence interval bounded by $\sigma/\sqrt{n}$, at best

Where n is the number of outcomes they have observed

2. Experts will perceive a standard deviation (and thus perhaps the extrema of a triangle?) as a variance of $\sigma^2$ times a chi-square(n) divided by n, at best

The above do not yet consider the implications of truncation of the value of $\sigma$
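To get a feel for how noisy an expert's perception could be, the sketch below simulates many hypothetical experts who each observe only n project outcomes. It assumes, purely for illustration, that outcomes are normally distributed with mean 100 and SD 20; neither that assumption nor the function name comes from the study.

```python
import math
import random
import statistics

def perceived_moments(n_projects, n_experts, mu, sigma, seed=3):
    """Each simulated expert sees n_projects outcomes and forms a mean and an SD."""
    rng = random.Random(seed)
    means, sds = [], []
    for _ in range(n_experts):
        seen = [rng.gauss(mu, sigma) for _ in range(n_projects)]
        means.append(statistics.fmean(seen))
        sds.append(statistics.stdev(seen))   # sample SD from the n observations
    return means, sds

mu, sigma, n = 100.0, 20.0, 5    # hypothetical values
means, sds = perceived_moments(n, 5_000, mu, sigma)
spread_of_means = statistics.pstdev(means)
print(f"spread of perceived means={spread_of_means:.2f}, "
      f"sigma/sqrt(n)={sigma / math.sqrt(n):.2f}")
print(f"perceived SDs range from {min(sds):.1f} to {max(sds):.1f} (true SD {sigma})")
```

With only a handful of observations, the perceived SD ranges widely around the true value, which is one way to see why individual testimony on dispersion is less reliable than testimony on the mean.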
► Combining Corrections for Extrema and


We  have  shown  that  individual  distributions  can  be  corrected 
for  a  consistent  pattern  of  understatement 

►  We  have  shown  that  sampling  of  multiple  experts  will  improve 
the  mean  and  widen  the  spread 

But  we  don't  have  a  good  sense  of  how  much  the  spread  will  be  improved 

►  The  implication  of  the  two  above  statements  is  that  we  should 
not  endeavor  to  both  expand  and  sample  expert  distributions 

If we correct the individual distributions, we will have the dispersion "about right"; if we then sample them, we will have a dispersion that exceeds "about right"

►  So,  for  "single  reality",  do  one  or  the  other  but  not  both 

Expansion  of  a  single  distribution  focuses  on  the  dispersion 
Sampling  of  diverse  experts  focuses  on  getting  the  mean  right 

Since we generally recommend correcting lower order moments first [11], conflation is the priority


►  Conflation:  Sampling 


►  Sampling  of  each  distribution  has  excellent 
characteristics 

-  It  replicates  what  the  experts  told  us  exactly 

► It has a problem in use for a single reality situation because the standard deviation is not easily correctable for scatter, nor is it usable without correction

We can easily correct each expert's testimony for truncation

- But we cannot undo the growth caused by expert scatter, which is theoretically unbounded ... the adjustment would be a function of k, the number of experts, and has yet to be ascertained

►  We  conclude  that,  despite  its  popularity  in  the 
literature,  the  sampling  technique  is  too  tricky  in  a 
single  reality  case  and  should  not  be  used 
► Recommendation - Multiple Reality


► The mean of the multiple reality is not troublesome; almost any reasonable approach will yield the same mean

Again  that  dangerous  word  "reasonable"! 

►  The  standard  deviation  does  not  present  as  much  of  a 
problem  in  a  multiple  reality  case  because  we  believe 
each  expert,  like  the  six  blind  men,  sees  a  piece  of 
the  truth 

►  Use  sampling 

► Correct each expert's testimony before sampling; you cannot easily correct it afterwards. Order matters
An Actual Case Study
► The SME Data


►  Actual  SME  data  was  collected  on  a  number  of 
subtasks 

►  Each  SME  was  providing  estimates  of  the  same  tasks 
without  collaboration 

►  The  data,  while  not  strictly  pathological,  was 
sufficiently  different  to  provide  a  good  test  of  our 
findings 

►  Our  paper  was  written  for  this  study,  but  our 
methodology  development  was  divorced  from  the 
data  until  the  end 

►  The  data  source  is  sufficiently  obscured,  by  a  single 
linear  transformation,  to  prevent  traceback 
► The Original Data


The  transformed  source  data  shows  a  dispersion  of  opinion 

►  It  was  unclear  whether  this  was  a  case  of  multiple  reality 

The  study  authors  concluded  that  it  might  be,  so  they  chose  sampling 

►  We  will  compute  the  results  from  all  the  methods  we 
examined  and  plot  the  results  of  the  two  methods  we  selected 


[Figures: distributions for SME 1, SME 2, SME 3, SME 4 and SME 5]
► Moments of the Postulated Methods

Methods recap

1a Average the results of each SME on each run

1b Same as 1a with correlation = 1.0 (same as 2a below for symmetric, a bit different for skewed)

2a Average the parameters of the SMEs (use the average of the means or the average of the modes)

2b Min of the mins, average of the modes and max of the maxes

2c Min of the mins, average of the means and max of the maxes

3 Sampling (equivalent to averaging PDFs)

As we expected, the means are almost all the same

Method 2b used averaged modes, so the mean is not preserved

Method 2c, an attempt to salvage 2b, used averaged means but routinely returned modes below the min, so was unusable

As we expected, the standard deviation is the parameter that responds to our choices

SD of 1a was "too small"

SD of 2b, the rejected 2c and 3 were "too big"

The SD of 2a was "Goldilocks"
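The ordering of the standard deviations across methods can be reproduced with a small simulation. The sketch below is not the study's model and does not use the study's data: the five triangles are hypothetical, and the inverse-CDF helper `tri_ppf` is our own. It implements methods 1a, 1b, 2a and 3 as described in the recap.

```python
import math
import random
import statistics

def tri_ppf(u, a, c, b):
    """Inverse CDF of Triangle(low=a, mode=c, high=b) at probability u."""
    if u < (c - a) / (b - a):
        return a + math.sqrt(u * (b - a) * (c - a))
    return b - math.sqrt((1.0 - u) * (b - a) * (b - c))

# Hypothetical SME triangles as (low, mode, high); not the study data
smes = [(50, 80, 120), (70, 100, 160), (90, 130, 200), (100, 150, 230), (60, 110, 170)]
k = len(smes)
# Method 2a parameters: average the mins, the modes, and the maxes
avg = tuple(sum(t[j] for t in smes) / k for j in range(3))

rng = random.Random(11)
draws = {"1a": [], "1b": [], "2a": [], "3": []}
for _ in range(50_000):
    # 1a: an independent random number per SME, then average the draws
    draws["1a"].append(statistics.fmean(tri_ppf(rng.random(), *t) for t in smes))
    # 1b: one shared random number (correlation = 1.0), then average the draws
    u = rng.random()
    draws["1b"].append(statistics.fmean(tri_ppf(u, *t) for t in smes))
    # 2a: draw from the single parameter-averaged triangle
    draws["2a"].append(tri_ppf(rng.random(), *avg))
    # 3: pick one SME at random and draw from that SME's triangle
    draws["3"].append(tri_ppf(rng.random(), *smes[rng.randrange(k)]))

results = {m: (statistics.fmean(x), statistics.pstdev(x)) for m, x in draws.items()}
for method, (mean, sd) in results.items():
    print(f"{method}: mean={mean:.1f} sd={sd:.1f}")
```

In runs of this sketch the means agree closely, while the SD of 1a is the smallest, 1b and 2a are close to each other, and 3 is the largest, in line with the pattern reported above.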


[Charts: Mean and SD by method (1a, 1b, 2a, 2b, 2c, 3)]
► Graphs of the Two Recommended Methods

The "Sand Chart" shows sampling, the preferred method for multiple realities

It retains all the information told to us by the SMEs equally

It suggests, in this example, that there may be three different modes, representing 3 different possibilities

The "Line Chart" shows averaged parameters, the preferred method for single reality

It responds to all SMEs, but produces a uni-modal, less dispersed solution

It suggests, in this example, that SME 1 was too low while SMEs 3 and 4 were a bit pessimistic on the high end

Both methods are credible and both do a decent job of synthesizing what the SMEs told us
► Conclusion for the Conflation of Experts

As asserted, we have illustrated that averaging the parameters of k triangles is equivalent to drawing from each of those k triangles with a single shared random number and then averaging the draws (exactly so for symmetric triangles)

We have demonstrated why those two equivalent methods give the simplest and clearest result for Single Reality and seem the best representation of what the k experts seem to have meant

We  have  shown  why  Sampling  of  k  experts  gives  the  best  representation  of 
what  the  k  experts  seem  to  have  meant  in  the  case  of  Multiple  Realities 

We  presented  a  case  study  with  actuals  that  shows  that  the  two  recommended 
approaches  do  a  decent  job  of  synthesizing  what  the  SMEs  told  us 

The  issue  of  deciding  between  Single  and  Multiple  Realities  remains  the  most 
difficult  issue 

Sometimes  it  will  be  as  simple  as  learning  that  each  expert  has  in  mind  "a  different 
engine" 

Sometimes  it  will  be  a  concession  to  the  wide  dispersion  and  the  recognition  that  there 
"must  be  a  reason." 

We  will  now  move  to  a  different  topic,  that  of  correcting  mischaracterization  of 
distributions,  without  which  this  paper  would  seem  incomplete 
Correcting the (Mis)characterization of Distributions
► The Problem


■  "Experts"  who  may  know  a  lot  about  the  technical  issues, 
and  maybe  even  the  cost  of  them,  will  not  necessarily  be 
well  versed  in  probability 

Consequently,  the  characterizations  they  will  produce  will  not  be 
easily  used  and  will  sometimes  be  incoherent  (meaning,  internally 
contradictory) 

►  Expert  testimony  in  risk  analysis  should  be  accorded  the 
same  respect  that  cost  data  is  in  cost  analysis 

Tenet  1:  "Do  no  harm"  meaning  preserve  as  much  of  what  the 
expert  said  as  is  possible  in  achieving  coherence 

Tenet  2:  Preserve  lower  order  moments  above  higher  order 
moments 

Tenet  3:  If  particular  aspects  are  more  important  than  others, 
preserve  those  aspects  (e.g.,  if  the  variability  or  upper  percentiles 
are  the  focus,  accord  those  greater  priority) 

► It is preferable to make the corrections with direct feedback to the expert, but this feedback should be done under the same precepts as the corrections

Meaning,  follow  the  tenets  in  your  persuasions  and  probing 
► Implausible Percentiles

"The 20/50/80 are $0.0M/$0.9M/$3.6M"

► No triangle can fit this, and the distribution is wildly skewed, so simplifying steps were taken:

1. Assume that the stated "50%-ile" is really the mode

2. Take the 20 and 80 as "about true", and assume they are ±σ

3. Use the rule that the half-base lengths of a symmetric triangle are √6·σ. Note that these triangles are not symmetrical, but use it as a factor that probably does a decent job

Results:

          Input             Output
20%-ile   0.0         L     -1.305
50%-ile   0.9         M      0.900
80%-ile   3.6         H      7.514
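The correction steps for this case can be written out as a short calculation. The function name below is ours; the logic is simply the simplification described on the slide (treat the stated 50%-ile as the mode, treat the 20 and 80 as one sigma either side, and scale each side by √6).

```python
import math

def triangle_from_20_50_80(p20, p50, p80):
    """Treat p50 as the mode and p20/p80 as mode -/+ one sigma on each side,
    then stretch each side by sqrt(6) to reach the triangle's Low and High."""
    factor = math.sqrt(6.0)
    low = p50 - factor * (p50 - p20)
    high = p50 + factor * (p80 - p50)
    return low, p50, high

low, mode, high = triangle_from_20_50_80(0.0, 0.9, 3.6)
print(round(low, 3), round(mode, 3), round(high, 3))   # -1.305 0.9 7.514
```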


Note  that  the  correction  may  be  distorting  the  central 
tendency 

But,  this  distribution  is  clearly  intended  to  be  skewed,  and  the  mean  is 
therefore  above  the  median 

We  cannot  actually  compute  the  mean  with  the  information  given 

We  also  knew  that  in  this  analysis,  the  ROS  at  the  80th  percentile  was  a 
particular  focus,  so  we  felt  that  preservation  of  that  point  should  take 
priority  (Tenet  3) 
► Unlikely distributions


►  Risk  values: 

-  20%  probability  of  -$2M 

-  40%  probability  of  $0 

-  20%  probability  of  +$4M 

► Suspecting that this was just a clumsy way to characterize a triangle, we asked if a triangle with the below characteristics was along the lines of what the expert meant:

- 20%-ile -$2M

- Mode $0M

- 80%-ile +$4M

... the expert agreed readily that the precise distribution wasn't what he meant, and that the triangle captured the sense of it.
Errors Of Characterization Induced by the Risk Analyst


Categorical*  risk  distributions 

Many  risk  models  cannot  easily  (or  rather  obviously)  implement  a  categorical 
random  variable  beyond  a  Bernoulli 

Many  can  do  it,  most  analysts  don't  realize  they  can 

For  a  3-value  categorical,  with  choices  of  0,  a  and  b,  many  analysts  implement  it 
as  two  independent  Bernoullis  with  values  of  0  or  a  and  0  or  b 

This  results  in  an  error  as  the  results  are  not  the  same  ...  the  two  Bernoullis  can 
turn  out  as  a  and  b  at  the  same  time,  but  the  original  formulation  prohibits  that 

Either implement it as a categorical or create two Bernoullis with the right characteristics (that is, made mutually exclusive so that a and b cannot both occur)
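The difference between the two implementations shows up directly in simulation. In the sketch below the values (0, a = 2, b = 5) and probabilities (0.5, 0.3, 0.2) are hypothetical, and the "exclusive pair" construction, in which the second Bernoulli is only evaluated when the first does not fire and its probability is rescaled, is one possible reading of "two Bernoullis with the right characteristics", not necessarily the authors'.

```python
import random

values, probs = (0.0, 2.0, 5.0), (0.5, 0.3, 0.2)   # hypothetical 0, a, b
_, a, b = values
_, p_a, p_b = probs
rng = random.Random(2)
n = 100_000

# Correct: a single categorical draw
categorical = [rng.choices(values, weights=probs, k=1)[0] for _ in range(n)]

# Erroneous: two independent Bernoullis, which can both fire on the same run
independent = [(a if rng.random() < p_a else 0.0) + (b if rng.random() < p_b else 0.0)
               for _ in range(n)]

def exclusive_pair(rng):
    """Two Bernoullis made mutually exclusive: b is only considered if a did not
    occur, with its probability rescaled to p_b / (1 - p_a)."""
    if rng.random() < p_a:
        return a
    return b if rng.random() < p_b / (1.0 - p_a) else 0.0

exclusive = [exclusive_pair(rng) for _ in range(n)]

def share(draws, value):
    """Fraction of draws equal to value."""
    return sum(1 for x in draws if x == value) / len(draws)

print("P(a and b together):",
      share(categorical, a + b), share(independent, a + b), share(exclusive, a + b))
```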


Triangular  risk  distributions 

Sometimes  the  end  points  are  set  at  the  standard  deviation  of  the  formulation 

Sometimes  triangles  are  used  instead  of  normals,  even  when  the  normal  was 
proposed,  out  of  aversion  to  negative  outcomes 

In  practice,  negative  outcomes  are  harmless  in  Monte  Carlo 
Negative  outcomes  ought  to  be  fairly  rare  anyway 


Normals 

Sometimes  triangles  are  substituted  incorrectly  (see  above) 

If  the  mean  and  standard  deviation  are  captured  correctly  there  is  little  harm 

Sometimes the negative portion of the normal is truncated, even though this shifts the formulated mean and reduces the standard deviation
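The effect of truncating the negative portion can be seen with a small simulation. The sketch below assumes a hypothetical normal with mean 1 and SD 1, and models truncation by discarding negative draws; the function name is ours.

```python
import math
import random

def truncated_normal_moments(mu, sigma, n_draws=100_000, seed=5):
    """Mean and SD of a normal after discarding (truncating) negative draws."""
    rng = random.Random(seed)
    kept = []
    while len(kept) < n_draws:
        x = rng.gauss(mu, sigma)
        if x >= 0.0:
            kept.append(x)
    mean = sum(kept) / n_draws
    sd = math.sqrt(sum((x - mean) ** 2 for x in kept) / n_draws)
    return mean, sd

mu, sigma = 1.0, 1.0    # hypothetical formulated mean and SD
trunc_mean, trunc_sd = truncated_normal_moments(mu, sigma)
print(f"formulated: mean={mu}, sd={sigma}")
print(f"truncated:  mean={trunc_mean:.3f}, sd={trunc_sd:.3f}")
```

The closer the formulated mean sits to zero relative to its SD, the larger the shift; when negative outcomes are rare, the distortion is small.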


*Categorical risk distributions are like Bernoullis but allow 2 or more values (the Bernoulli is a member of the family)
► Conclusion for Correcting Mischaracterization of Distributions

► We have presented tenets by which apparent errors of characterization may be corrected and have listed the most common Risk-Analyst-induced errors


►  We  finish  by  reiterating  that  the  testimony  of  the 
experts  we  consult  should  be  handled  much  as  we 
should  handle  data 

-  We  must  be  careful  in  not  ignoring  the  symptoms  of  the 
testimony,  and  avoid  such  elementary  errors  as  causing 
anchoring*  and  "leading  the  witness." 

We  should,  nonetheless  carefully  repair  any  clear  errors  caused 
by  the  unfamiliarity  with  probability  that  can  result  in  unlikely 
distributions 
► Final Thoughts


The conflation of expert testimony has received some attention in the literature, but few if any of the conclusions seem to have permeated the cost risk discipline

► We hope that we have provided a reasonably thorough paper by which risk analysts might be guided

► We also hope that we have provided a few good tenets for correcting mischaracterization, along with some illustrative (actual) examples.


We  hope  to  be  able  to  take  on  the  issue  of  what  we  call  the 
meta-distribution,  the  likely  distribution  of  individual  expert 
testimony 

Without  a  good  model  for  the  meta-distribution,  the  full  demonstration  of 
the  best  answers  will  remain  incomplete,  because  the  meta-distribution  is 
the  unseen  ground  truth  against  which  these  answers  can  be  measured 

Until  we  can  be  satisfied  we  have  the  meta-distribution,  we  are  confined  to 
showing  the  behavior  of  various  methods  and  deciding  if  that  behavior 
seems  correct 
Backup
► The Geometry of Symmetric Triangles


For a symmetric Triangle(L, M, H), where M-L = H-M

► Find points l and h such that l and h are the pth and (1-p)th percentiles

If l-L = 1/2*(M-L), H-h = 1/2*(H-M), then p = 1/(2*2^2) = 1/8 = 12.5%

If l-L = 1/3*(M-L), H-h = 1/3*(H-M), then p = 1/(2*3^2) = 1/18 = 5.6%

pth percentile -> $\sqrt{p/2}$ base fraction -> $\sqrt{2p}$ half-base fraction

So, the 20th percentile (p = 1/5) occurs at point $\sqrt{1/10}$ = 0.3162 base fraction
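These percentile relationships for a symmetric triangle are easy to tabulate. The helper names below are ours; the formulas are the ones stated on the slide (valid for p up to 0.5).

```python
import math

def half_base_fraction(p):
    """Distance from L to the p-th percentile (p <= 0.5), as a fraction of the half-base."""
    return math.sqrt(2.0 * p)

def base_fraction(p):
    """Same distance expressed as a fraction of the full base."""
    return math.sqrt(p / 2.0)

def percentile_at_half_base_fraction(f):
    """Inverse relationship: the percentile reached a fraction f along the lower half-base."""
    return f * f / 2.0

for p in (1.0 / 18.0, 1.0 / 8.0, 0.2):
    print(f"p={p:.4f}: base fraction={base_fraction(p):.4f}, "
          f"half-base fraction={half_base_fraction(p):.4f}")
```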


►  Triangles  With  Related  Areas 


► We wish to know how to draw triangular distributions that are related to one another:

► Constant area:

Used in expansion of experts (correcting understated variance)

For area to remain constant, in this case A = 1, as the base increases by a factor, the height must be multiplied by the reciprocal of that factor

$$A = \tfrac{1}{2} b h = \tfrac{1}{2} (bk) \left( \frac{h}{k} \right)$$

► Reduction in area:

For area to be reduced by a factor, the dimensions of a similar triangle must be reduced by the square root of that factor

$$A_1 = \frac{1}{k} A = \frac{1}{2k} b h = \tfrac{1}{2} \left( \frac{b}{\sqrt{k}} \right) \left( \frac{h}{\sqrt{k}} \right)$$

For area to be reduced by a factor, the height must be reduced by that factor if the base is to remain constant

Used in sampling of experts

$$A_1 = \frac{1}{k} A = \frac{1}{2k} b h = \tfrac{1}{2} b \left( \frac{h}{k} \right)$$
► Correction of Understated Variance for Triangles

► For symmetric triangles

- To expand from 20-80 to Min-Max, multiply by 2.72 = 1/0.368

  - $\sqrt{1/10}$ = 0.3162 base fraction

  - $\sqrt{2/5}$ = 0.6325 half-base fraction

- To expand from plus-or-minus-one-sigma to Min-Max, multiply by 2.45 ($\sqrt{6}$)

  - $(\sqrt{6}-1)/(2\sqrt{6})$ = 0.2959 base fraction

  - $(\sqrt{6}-1)/\sqrt{6}$ = 0.5918 half-base fraction

  - Compare with 68.3% within one sigma rule of thumb for Normal distribution
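The two expansion factors follow directly from the symmetric-triangle geometry. The sketch below recomputes them and applies the 20-80 factor to a hypothetical expert range of 90 to 110 around a mode of 100; the function names and the example range are ours.

```python
import math

def expansion_factor_from_percentiles(p):
    """Factor that stretches a symmetric p-to-(1-p) interval to the full Min-Max base."""
    inner_fraction = 1.0 - 2.0 * math.sqrt(p / 2.0)   # share of the base inside the interval
    return 1.0 / inner_fraction

def expansion_factor_from_sigma():
    """Factor that stretches a plus-or-minus-one-sigma interval to Min-Max."""
    return math.sqrt(6.0)

def expand(low_stated, mode, high_stated, factor):
    """Stretch the stated interval about the mode by the given factor."""
    return (mode - factor * (mode - low_stated), mode,
            mode + factor * (high_stated - mode))

f_2080 = expansion_factor_from_percentiles(0.2)
f_sigma = expansion_factor_from_sigma()
print(f"20-80 factor={f_2080:.2f}, one-sigma factor={f_sigma:.2f}")

# Hypothetical expert statement: "20th percentile 90, mode 100, 80th percentile 110"
low, mode, high = expand(90.0, 100.0, 110.0, f_2080)
print(f"expanded triangle: L={low:.2f}, M={mode:.2f}, H={high:.2f}")
```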




►  Triangular  Distribution  -  PDF  and  Mean 


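For Triangle(L, M, H), denote L = a, H = b, ML = c, written T(a, c, b)

► Since the area of the triangle must be 1 (100%), the height is twice the reciprocal of the base

We can then derive the PDF by using similar triangles

$$p(x) = \begin{cases} \dfrac{2(x-a)}{(b-a)(c-a)}, & a \le x \le c \\[2ex] \dfrac{2(b-x)}{(b-a)(b-c)}, & c \le x \le b \end{cases}$$

$$\mu = \int_a^b x\,p(x)\,dx = \frac{1}{b-a}\left[\frac{2c^2 - ac - a^2}{3} + \frac{b^2 + bc - 2c^2}{3}\right] = \frac{1}{b-a}\left[\frac{bc - ac}{3} + \frac{b^2 - a^2}{3}\right] = \frac{a+b+c}{3}$$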
► Triangular Distribution - Variance

$$\sigma^2 = E\left[(X-\mu)^2\right] = E(X^2) - \mu^2$$

$$E(X^2) = \int_a^b x^2 p(x)\,dx = \int_a^c \frac{2x^2 (x-a)}{(b-a)(c-a)}\,dx + \int_c^b \frac{2x^2 (b-x)}{(b-a)(b-c)}\,dx$$

$$= \frac{2}{(b-a)(c-a)} \left[ \frac{x^4}{4} - \frac{a x^3}{3} \right]_a^c + \frac{2}{(b-a)(b-c)} \left[ \frac{b x^3}{3} - \frac{x^4}{4} \right]_c^b$$

$$= \frac{1}{b-a} \left[ \tfrac{1}{2}(c^3 + c^2 a + c a^2 + a^3) - \tfrac{2}{3}(c^2 a + a^2 c + a^3) + \tfrac{2}{3}(b^3 + b^2 c + b c^2) - \tfrac{1}{2}(b^3 + b^2 c + b c^2 + c^3) \right]$$

$$= \frac{a^2 + b^2 + c^2 + ab + ac + bc}{6}$$

$$\mu^2 = \left( \frac{a+b+c}{3} \right)^2 = \frac{a^2 + b^2 + c^2 + 2ab + 2ac + 2bc}{9}$$

$$E(X^2) - \mu^2 = \frac{3a^2 + 3b^2 + 3c^2 + 3ab + 3ac + 3bc}{18} - \frac{2a^2 + 2b^2 + 2c^2 + 4ab + 4ac + 4bc}{18}$$

$$= \frac{a^2 + b^2 + c^2 - ab - ac - bc}{18} = \frac{b^2 - 2ab + a^2 - bc + ab - ac + c^2}{18} = \frac{(b-a)^2 - (b-c)(c-a)}{18}$$

Square of the base minus product of the half-bases (all divided by 18)!
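The closed forms for the mean and variance can be checked against a direct numerical integration of the triangular PDF. The sketch below uses a hypothetical triangle T(2, 3, 7) and a simple midpoint rule; the function names are ours.

```python
def tri_pdf(x, a, c, b):
    """PDF of Triangle(low=a, mode=c, high=b), derived from similar triangles."""
    if a <= x <= c and c > a:
        return 2.0 * (x - a) / ((b - a) * (c - a))
    if c < x <= b:
        return 2.0 * (b - x) / ((b - a) * (b - c))
    return 0.0

def numeric_moments(a, c, b, steps=100_000):
    """Mean and variance by midpoint-rule integration of x*p(x) and x^2*p(x)."""
    h = (b - a) / steps
    m1 = m2 = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * h
        w = tri_pdf(x, a, c, b) * h
        m1 += x * w
        m2 += x * x * w
    return m1, m2 - m1 * m1

a, c, b = 2.0, 3.0, 7.0    # hypothetical triangle
closed_mean = (a + b + c) / 3.0
closed_var = ((b - a) ** 2 - (b - c) * (c - a)) / 18.0
num_mean, num_var = numeric_moments(a, c, b)
print(f"mean: closed={closed_mean:.6f} numeric={num_mean:.6f}")
print(f"variance: closed={closed_var:.6f} numeric={num_var:.6f}")
```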

► Substituting a Triangular for a Normal: The √6 Factor

For a symmetric triangle, let ML = m, L = m-w, H = m+w, where w is the half-base

Then the mean is m, and the variance is $w^2/6$

It follows that the half-base is greater than the standard deviation by a factor of $\sqrt{6}$

To approximate a normal, $N(\mu, \sigma)$, the factor of $\sqrt{6}$ is multiplied by the standard deviation of the normal to be emulated to produce the half-base

By this means, end points are found that will produce a triangular distribution that emulates the underlying normal in mean and standard deviation

This triangular distribution, Triangular($\mu - \sqrt{6}\sigma$, $\mu$, $\mu + \sqrt{6}\sigma$), differs from the underlying normal in all other moments, and at all percentiles other than the median and two "cross-over" points, but the difference is minor
► Variance of Hybrid Distributions - A Pythagorean Relationship

Suppose k distributions with pdf $p_i(x)$, mean $\mu_i$, and standard deviation $\sigma_i$ are sampled

► Then the pdf of the hybrid distribution is the "average" of the pdfs

$$p(x) = \frac{1}{k} \sum_{i=1}^{k} p_i(x)$$

► The mean of the hybrid distribution is the average of the means

$$\mu = E(X) = \frac{1}{k} \sum_{i=1}^{k} \int x\,p_i(x)\,dx = \frac{1}{k} \sum_{i=1}^{k} \mu_i$$

► The variance of the hybrid distribution is the average of the variances plus the variance of the means taken as a discrete probability distribution!

See next slide for derivation
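► Variance of Hybrid Distributions - A Pythagorean Relationship

$$\sigma_{\text{hybrid}}^2 = E(X^2) - \mu^2 = \frac{1}{k} \sum_{i=1}^{k} \int x^2 p_i(x)\,dx - \mu^2 = \frac{1}{k} \sum_{i=1}^{k} \left( \sigma_i^2 + \mu_i^2 \right) - \mu^2$$

$$= \frac{\sum_{i=1}^{k} \sigma_i^2}{k} + \left[ \frac{\sum_{i=1}^{k} \mu_i^2}{k} - \mu^2 \right]$$

In the special case of two congruent distributions, each with standard deviation $\sigma$, with centers at m-d and m+d, the variance is

$$\sigma^2 + \frac{(m-d)^2 + (m+d)^2}{2} - m^2 = \sigma^2 + d^2$$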
► Equivalence of Averaging Distributions and Averaging Parameters for Symmetric Triangles


In  the  case  of  symmetric  triangles,  averaging  the  individual  triangles  (with 
perfect  rank  correlation)  -  method  lb  -  can  be  shown  to  be  equivalent  to 
averaging  the  parameters  -  method  2a 

We  will  prove  it  in  the  case  of  two  triangles,  but  the  proof  can  easily  be  extended  to  more 

As previously shown, the pth percentile (p<0.5) for a symmetric triangle is at the $\sqrt{2p}$ half-base fraction

So the pth percentiles of the two triangles and their average are:

$$a_1 + \sqrt{2p}\,(c_1 - a_1), \qquad a_2 + \sqrt{2p}\,(c_2 - a_2)$$

$$\frac{a_1 + a_2}{2} + \sqrt{2p}\,\frac{(c_1 - a_1) + (c_2 - a_2)}{2}$$

But this is clearly just the pth percentile of the average distribution

$$\left( \frac{a_1 + a_2}{2} \right) + \sqrt{2p} \left[ \left( \frac{c_1 + c_2}{2} \right) - \left( \frac{a_1 + a_2}{2} \right) \right]$$

A  similar  proof  works  for  p>0.5 

Since  all  percentiles  are  equal,  the  resulting  distributions  are  identical 


Monte  Carlo  simulation  could  be  used  to  explore  the  difference  between 
the  two  methods  for  asymmetric  triangles,  but  it  is  not  expected  to  be 
large 
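The equivalence for symmetric triangles, and the size of the gap for skewed ones, can be explored without Monte Carlo by comparing percentiles directly. In the sketch below the triangles are hypothetical and `tri_ppf` is our own inverse-CDF helper; the skewed pair is included only to show how such an exploration might be set up, not to quantify the difference in general.

```python
import math

def tri_ppf(u, a, c, b):
    """Inverse CDF of Triangle(low=a, mode=c, high=b) at probability u."""
    if u < (c - a) / (b - a):
        return a + math.sqrt(u * (b - a) * (c - a))
    return b - math.sqrt((1.0 - u) * (b - a) * (b - c))

def max_percentile_gap(t1, t2, points=999):
    """Largest gap between method 1b (average of the two triangles' percentiles)
    and method 2a (percentile of the parameter-averaged triangle)."""
    avg = tuple((x + y) / 2.0 for x, y in zip(t1, t2))
    gap = 0.0
    for i in range(1, points + 1):
        u = i / (points + 1.0)
        method_1b = (tri_ppf(u, *t1) + tri_ppf(u, *t2)) / 2.0
        method_2a = tri_ppf(u, *avg)
        gap = max(gap, abs(method_1b - method_2a))
    return gap

# Hypothetical triangles as (low, mode, high)
symmetric_gap = max_percentile_gap((80.0, 100.0, 120.0), (150.0, 200.0, 250.0))
skewed_gap = max_percentile_gap((80.0, 90.0, 140.0), (150.0, 220.0, 250.0))
print(f"symmetric pair: max gap={symmetric_gap:.2e}")
print(f"skewed pair:    max gap={skewed_gap:.3f}")
```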
► Equivalence of Averaging Means and Averaging Modes for Triangles


If  we  average  parameters  -  method  2a  -  as  long  as  we  average  mins  and 
maxes,  it  doesn't  matter  whether  we  average  means  or  modes 

Algebraically  equivalent 

Any  number  of  triangles,  symmetry  not  required 


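Let the ith triangle be $T(a_i, c_i, b_i)$, and the parameter-averaged triangle be T(A, C, B), where

$$A = \frac{\sum_{i=1}^{k} a_i}{k}, \qquad C = \frac{\sum_{i=1}^{k} c_i}{k}, \qquad B = \frac{\sum_{i=1}^{k} b_i}{k}$$

This is averaging the modes; the resulting mean is

$$\frac{A + B + C}{3} = \frac{\sum_{i=1}^{k} a_i + \sum_{i=1}^{k} b_i + \sum_{i=1}^{k} c_i}{3k} = \frac{1}{k} \sum_{i=1}^{k} \left( \frac{a_i + b_i + c_i}{3} \right)$$

which is just the average of the means!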

Reversing  the  flow,  averaging  the  means  can  be  shown  to  produce  a  mode 
which  is  the  average  of  the  modes 